HistFactory Input to calculate Combined SR exclusion limits
HistFactory JSON files can be attached to an analysis in order to estimate exclusion limits from combined and global likelihoods. The JSON files need to be placed in the same folder as the info file, and the required information has to be added to the info file as shown below. Note that this is an additional subelement of the <analysis> main element, which is described here.
<pyhf id="RegionA">
  <name>atlas_susy_2018_031_SRA.json</name>
  <regions>
    <channel name="SR_meff"> SRA_L SRA_M SRA_H </channel>
    <channel name="VRtt_meff"></channel>
    <channel name="CRtt_meff"></channel>
  </regions>
</pyhf>
Here <pyhf id="RegionA"> is the identifier of the profile; it will be printed in the output file to label the exclusion estimates calculated with this specific likelihood profile, and it can be any name without spaces. <name>atlas_susy_2018_031_SRA.json</name> is the name of the HistFactory JSON file. <channel name="SR_meff"> is the name of a channel as specified in the JSON file; please note that in case of a wrong declaration the profile will be ignored. In the example above, the channel SR_meff has three signal regions, declared as SRA_L, SRA_M and SRA_H. These names correspond to the names of the signal regions as declared in the analysis recast. Their ordering MUST be the same as in the JSON file; otherwise the exclusion limit will be calculated incorrectly. To make sure, please refer to the analysis description. VRtt_meff and CRtt_meff do not contain any signal region, since the validation and control regions are not included in the analysis recast due to a lack of information. If further help is needed, one can use write_histfactory_info.py
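To double-check the channel names and the number of bins (and hence the number of signal-region names to list) per channel, one can inspect the JSON workspace directly. A minimal sketch in Python, assuming a file name matching the one in the info file:

```python
import json

def channel_bins(json_file):
    """Return {channel name: number of bins} for a HistFactory JSON
    workspace, to help match signal-region names and their ordering."""
    with open(json_file) as f:
        workspace = json.load(f)
    # The bin count of a channel is the length of any sample's data array.
    return {ch["name"]: len(ch["samples"][0]["data"])
            for ch in workspace["channels"]}

# Hypothetical usage; the file name must match the one in the info file:
# channel_bins("atlas_susy_2018_031_SRA.json")
```

For the example above this would report three bins for SR_meff, matching the three signal-region names SRA_L, SRA_M and SRA_H.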
python write_histfactory_info.py -i FILE1.json FILE2.json FILE3.json
where -i refers to the interactive mode, which writes the file by giving you directions. The JSON files have to be named as in the info file and placed in the same folder as the info file (~/madanalysis5/tools/<PADofChoice>/Build/SampleAnalyzer/User/Analysis/atlas_xyz_00_00.info)
In order to use HistFactory, one needs to install the pyhf package, which is done automatically via the following command
install pyhf
After these steps, exclusion limits for all given signal-region combinations are calculated automatically. Additionally, MadAnalysis5 constructs a global likelihood profile combining all given HistFactory files with the same parameter of interest.
How to write the <pyhf> information with write_histfactory_info.py, as declared above:
$ python write_histfactory_info.py -i atlas_susy_2018_031_SRA.json
Writing SR_meff...
Please write the name of 3 signal region from the analysis
corresponding to the following observed values: 12.0, 3.0, 2.0
>> SRA_L SRA_M SRA_H
Writing VRtt_meff...
Please write the name of 3 signal region from the analysis
corresponding to the following observed values: 210.0, 62.0, 22.0
>>
Please note that number of SR does not match...
Writing CRtt_meff...
Please write the name of 3 signal region from the analysis
corresponding to the following observed values: 153.0, 52.0, 19.0
>>
Please note that number of SR does not match...
HistFactory/pyhf FAQ
- Where can I find the HistFactory data?
JSON files are generally provided on HEPData, under the resources of the analysis in question.
- Where should I add the JSON files?
JSON files should be included with your C++ file; we encourage you to upload them to Inspire alongside the analysis code. Please rename them as indicated in the info file before uploading.
- pyhf installation is failing, how can I fix this?
pyhf has additional dependencies beyond those of MadAnalysis5. The requirements can be installed via
pip install click tqdm six jsonschema jsonpatch pyyaml
After installing those packages, please try to install pyhf again.
Combining SR using covariance matrices with the simplified likelihood method
Covariance matrices, provided for some CMS SUSY searches, can be used to build an approximate simplified likelihood. The info files from the Public Analysis Database can be extended with the covariance information, from which MadAnalysis5 builds a simplified likelihood. This allows one to compute combined CLs values and combined cross-section upper limits. The standard syntax of the info file
<analysis id="analysis name">
  <region type="signal" id="region name">
    <nobs> ... </nobs>
    <nb> ... </nb>
    <deltanb> ... </deltanb>
  </region>
  ...
</analysis>
specifying, for each SR, the number of observed events <nobs>, the number of expected background events <nb> and their uncertainty <deltanb>, is therefore extended by adding, in each <region> subelement, the successive covariance values with respect to all other regions:
<analysis id="analysis name" cov_subset="combined SRs">
  <region type="signal" id="region name">
    <nobs> ... </nobs>
    <nb> ... </nb>
    <deltanb> ... </deltanb>
    <covariance region="first SR name">...</covariance>
    <covariance region="second SR name">...</covariance>
    ...
    <covariance region="last SR name">...</covariance>
  </region>
  ...
</analysis>
where, for each <covariance> element, the associated region is specified with the region attribute. Every missing covariance value is interpreted as a zero element of the covariance matrix. If a <region> subelement does not contain any covariance value, it is not included in the set of combined regions. This makes it possible to combine only a subset of the signal regions; for instance, CMS-SUS-16-039 only provides covariances for signal regions of type A.
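These two conventions, that a missing covariance value counts as zero and that regions without any <covariance> entry are left out of the combination, can be illustrated by parsing such a file. A sketch with hypothetical region names and numbers:

```python
import xml.etree.ElementTree as ET

# Hypothetical info-file fragment: SR1 and SR2 carry covariance values,
# SR3 does not and is therefore excluded from the combination.
INFO = """
<analysis id="example" cov_subset="SRs_A">
  <region type="signal" id="SR1">
    <nobs>10</nobs> <nb>8.0</nb> <deltanb>2.0</deltanb>
    <covariance region="SR1">4.0</covariance>
    <covariance region="SR2">1.5</covariance>
  </region>
  <region type="signal" id="SR2">
    <nobs>7</nobs> <nb>6.0</nb> <deltanb>1.5</deltanb>
    <covariance region="SR1">1.5</covariance>
    <covariance region="SR2">2.25</covariance>
  </region>
  <region type="signal" id="SR3">
    <nobs>3</nobs> <nb>2.5</nb> <deltanb>1.0</deltanb>
  </region>
</analysis>
"""

def covariance_matrix(xml_text):
    """Build the covariance matrix from an info file, keeping only the
    regions that declare <covariance> values; missing entries are zero."""
    root = ET.fromstring(xml_text)
    regions = [r for r in root.findall("region")
               if r.find("covariance") is not None]
    names = [r.get("id") for r in regions]
    index = {n: i for i, n in enumerate(names)}
    cov = [[0.0] * len(names) for _ in names]
    for r in regions:
        i = index[r.get("id")]
        for c in r.findall("covariance"):
            cov[i][index[c.get("region")]] = float(c.text)
    return names, cov
```

Running this on the fragment above yields a 2x2 matrix over SR1 and SR2 only, mirroring how a subset of SRs is combined.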
In addition, an attribute cov_subset must be added to the <analysis> main element to store information about which subset of SRs is combined. In the case of CMS-SUS-16-039, this reads:
<analysis id="cms_sus_16_039" cov_subset="SRs_A">
The subset description will be printed to the output file, together with the results of the simplified likelihood combination, after the usual exclusion information:
<set> <tag> <cov_subset> <exp> <obs> <CLs> ||
The successive elements consist of the dataset name, the analysis name, the description of the subset of combined SRs, the expected and observed cross-section upper limits at 95% confidence level (CL), and finally the exclusion level 1-CLs. A concrete example reads
defaultset cms_sus_16_039 [SL]-SRs_A 10.4851515 11.1534040 0.9997 ||
where [SL] stands for simplified likelihood.
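The result line above is whitespace-separated and terminated by "||", so it can be split into its named fields mechanically. A small sketch (the function name is hypothetical):

```python
def parse_sl_result(line):
    """Split a simplified-likelihood result line into named fields.
    Columns: dataset, analysis, combined-SR subset, expected and observed
    95% CL cross-section upper limits, and the exclusion level 1-CLs."""
    dataset, analysis, subset, exp, obs, excl = line.rstrip(" |").split()
    return {"dataset": dataset, "analysis": analysis, "cov_subset": subset,
            "exp_UL": float(exp), "obs_UL": float(obs), "1-CLs": float(excl)}

result = parse_sl_result(
    "defaultset cms_sus_16_039 [SL]-SRs_A 10.4851515 11.1534040 0.9997 ||"
)
```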
Attachments (7)

- cms_sus_17_001.info (861 bytes) - added 5 years ago. Example of info file.
- write_histfactory_info.py (3.4 KB) - added 4 years ago. info file helper for HistFactory input.
- pyhf_python2.tar.gz (46.4 KB) - added 4 years ago. pyhf Ma5 Tune for python 2.
- pyhf_py3.tgz (92.9 KB) - added 4 years ago. pyhf python 3 version 0.5.4.
- simplify.tar.gz (1.2 MB) - added 3 years ago. Simplify full profile likelihoods.
- simplify-master.zip (447.1 KB) - added 3 years ago. Simplify full profile likelihoods.
- simplify.tgz (1.2 MB) - added 3 years ago. Simplify full profile likelihoods.