= An Introduction to MadAnalysis5 and Recasting LHC Results Part 3: Writing an analysis = This is a step-by-step introduction for using !MadAnalysis 5 on reconstructed level event files. It consists of three parts: * [https://madanalysis.irmp.ucl.ac.be/wiki/Ma5PadStepByStep Part 1: Installation] * [https://madanalysis.irmp.ucl.ac.be/wiki/UsingMa5Pad Part 2: Using the Public Analysis Database] * [https://madanalysis.irmp.ucl.ac.be/wiki/WritingAnalyses Part 3: Writing an analysis] Here we continue with Part 3. This part assumes that you have [https://madanalysis.irmp.ucl.ac.be/wiki/Ma5PadStepByStep properly installed MadAnalysis5 at Reco level]. You may want to start with [https://madanalysis.irmp.ucl.ac.be/wiki/UsingMa5Pad using an implemented analysis in MA5] first before writing your own. We will now write our own CAT-UED-16-013 analysis. More detailed documentation about how to do this and accessible variables and functions can also be found in [http://arxiv.org/abs/1405.3982 arXiv:1405.3982]. Many observables and functions are already implemented in [http://inspirehep.net/record/1301484/ the MA5 implementation of CMS-SUS-13-011], which can be used as a guide for writing a more complicated analysis than what is done here. == Creating the directory == This proceeds as for [https://madanalysis.irmp.ucl.ac.be/wiki/UsingMa5Pad using an implemented analysis in MA5]. Again, we will create a directory from within the interactive !MadAnalysis mode: {{{ ./bin/ma5 -R -E }}} We will call the analysis directory MINE. The name of the analysis is asked. We will call it cat_ued_16_013. A directory my_analysis is created with these files: {{{ MINE/Build/SampleAnalyzer/User/Analyzer/cat_ued_16_013.cpp MINE/Build/SampleAnalyzer/User/Analyzer/cat_ued_16_013.h }}} The cat_ued_13_016.h file can stay as is. We will now edit the .cpp file to write our own analysis. == Defining signal regions and cuts == We will now edit cat_ued_13_016.cpp to write our own analysis CAT-UED-16-013. In this file, you can find an Initialize, Finalize, and Execute part. In the Initialize part, we will define our signal regions and cuts. In the Execute part, we will write our code. === Signal regions === Suppose that our CAT-UED-16-013 contains signal regions in MET and MHT with 200 < MET < 500, MET > 500 and 100 < HT < 500, HT > 500. This results in 4 signal regions. These are defined with the syntax {{{ Manager()->AddRegionSelection("signal_region_name") }}} in the cat_ued_16_013.cpp file in the initialize part, that is, it will look like: {{{ // ----------------------------------------------------------------------------- // Initialize // function called one time at the beginning of the analysis // ----------------------------------------------------------------------------- bool cat_ued_16_013::Initialize(const MA5::Configuration& cfg, const std::map& parameters) { cout << "BEGIN Initialization" << endl; // initialize variables, histos // Initialize signal regions Manager()->AddRegionSelection("200AddRegionSelection("MET>500,100AddRegionSelection("200500"); // SR3 Manager()->AddRegionSelection("MET>500,HT>500"); // SR4 }}} === Cuts === In this same initialize block, so immediately after, we define the cuts. Note that the order in which the cuts are defined, is the order in which the cuts will appear in the cutflow. If this does not match the order in which the cuts are applied, the names of cuts in the cutflow results will be wrong! Here we differentiate between baseline cuts and signal region specific cuts. The baseline cuts in this case are: * HT > 100 GeV; * MET > 200 GeV; * a lepton veto for leptons with pT > 10 GeV; * jets must have a pT > 20 GeV. Furthermore, we have the region specific cuts, namely those that define the signal regions. The baseline cuts are defined as follows: {{{ // Baseline cuts Manager()->AddCut("nolepton"); Manager()->AddCut("MET>200"); Manager()->AddCut("HT>100"); }}} The signal-region specific cuts are defined as: {{{ // Signal-region specific cuts // MET < 500 for signal regions 1 and 3: string SR_1_3[] = {"200500"}; Manager()->AddCut("MET<500", SR_1_3); // HT < 500 applies to signal regions 1 and 2: string SR_1_2[] = {"200500,100AddCut("HT<500", SR_1_2); // HT > 500 applies to signal regions 3 and 4: string SR_3_4[] = {"200500", "MET>500,HT>500"}; Manager()->AddCut("HT>500", SR_3_4); // MET > 500 for signal regions 2 and 4: string SR_2_4[] = {"MET>500,100500,HT>500"}; Manager()->AddCut("MET>500", SR_2_4); }}} == The code == Now that we have defined everything we need in the Initialize part of cat_ued_16_013.cpp, we will write the code in the Execute part. === Define objects === First, we will select objects, in this case leptons (electrons and muons) and jets. This is done in the Execute block. Since we now work with a ROOT file, not an STDHEP file (as is shown in the example in the file, after the start of the Execute block: {{{ // ----------------------------------------------------------------------------- // Execute // function called each time one event is read // ----------------------------------------------------------------------------- bool cat_ued_16_013::Execute(SampleFormat& sample, const EventFormat& event) { }}} we will remove all code and write the following for events at the reconstructed level: {{{ // *************************************************************************** // Example of analysis with reconstructed particles // Concerned samples : ROOT // *************************************************************************** double myEventWeight; if(Configuration().IsNoEventWeight()) myEventWeight=1.; else if(event.mc()->weight()!=0.) myEventWeight=event.mc()->weight(); else { INFO << "Found one event with a zero weight. Skipping...\n"; return false; } Manager()->InitializeForNewEvent(myEventWeight); if (event.rec()!=0) { cout << "---------------NEW EVENT-------------------" << endl; }}} Most notably, here we changed the event.mc() to event.rec(). Immediately after the above lines, we declare the jet and lepton objects needed for our analysis. These are reconstructed objects that can be read from the Delphes ROOT file: {{{ // =====first declare the empty containers:=====¬ vector electron, muon; vector jet; }}} Now collect the electrons with a pT greater than 10 GeV: {{{ // =====fill the electrons container:===== for(unsigned int e=0; eelectrons().size(); e++) { const RecLeptonFormat *CurrentElectron = &(event.rec()->electrons()[e]); double pt = CurrentElectron->momentum().Pt(); if(!(pt>10)) continue; electron.push_back(CurrentElectron); } }}} Collect the muons: {{{ // =====fill the muons container:===== for(unsigned int m=0; mmuons().size(); m++) { const RecLeptonFormat *CurrentMuon = &(event.rec()->muons()[m]); double pt = CurrentMuon->momentum().Pt(); if(!(pt>10)) continue; muon.push_back(CurrentMuon); } }}} and collect jets, calculating HT on the way: {{{ // =====define HT====== double HT = 0; // =====fill the jet containers and order===== for(unsigned int j=0; jjets().size(); j++) { const RecJetFormat *CurrentJet = &(event.rec()->jets()[j]); double pt = CurrentJet->momentum().Pt(); if(!(pt > 20)) continue; // Add pt to HT: HT += pt; jet.push_back(CurrentJet); } }}} Define the missing energy MET: {{{ // =====Get the missing ET===== TLorentzVector pTmiss = event.rec()->MET().momentum(); double MET = pTmiss.Pt(); }}} In this same block, we now apply the cuts. === Apply cuts === Very important: In order to have the names in the cutflow results (that are given per region) match the corresponding cut, the cuts MUST be applied in the same order as they are defined! First, we apply the lepton veto: {{{ // =====Apply no-lepton cut======== if(!Manager()->ApplyCut((electron.size()+muon.size() == 0),"nolepton")) return true; }}} the same for the other baseline cuts: {{{ // =====Apply HT100 cut============ if(!Manager()->ApplyCut((HT > 100),"HT>100")) return true; // =====Apply MET200 cut=========== if(!Manager()->ApplyCut((MET > 200),"MET>200")) return true; }}} Finally, apply the signal-region specific cuts: {{{ // =====Apply signal region cuts======= Manager()->ApplyCut(MET < 500, "MET<500"); // SR1, SR3 Manager()->ApplyCut(MET >= 500, "MET>500"); // SR2, SR4 Manager()->ApplyCut(HT < 500, "HT<500"); // SR1, SR2 Manager()->ApplyCut(HT>=500, "HT>500"); // SR3, SR4 }}} and close the block and end the file with: {{{ // =====finished===== return true; } } }}} You can also find the [http://phys.onmybike.nl/ma5/cat_ued_16_013.cpp CAT-UED-16-013 C++ analysis file here]. For testing, you can use [http://phys.onmybike.nl/ma5/testfile.root this rootfile]. == The signal region information == The cat_ued_16_013.info file contains the information about observed and numbers of background events in our signal regions. This is needed when one wants to compute confidence levels or upper limits using exclusion_CLs.py from the !RecastingTools package. For our analysis we have a luminosity of 20fb-1, and four each of the four signal regions we have observed numbers of events (nobs), estimated backgrounds (nb), and background errors (deltanb) as follows: * Signal region 1, 200500,100500: nobs=451, nb=418.0, deltanb=66.0 * Signal region 4, MET>500,100, number of expected backgrounds and number of background uncertainty in each of the regions --> 20.0 6159 6090.0 670.0 2305 2280.0 270.0 454 418.0 66.0 62 57.4 11.2 }}}