wiki:Reweight
Last modified on 08/03/17 12:03:09

Description of the method

The method consists of using a sample of events (weighted or unweighted) generated under a certain theoretical hypothesis (a model and its parameters with given values), and associating with those events an additional weight that corresponds to a new theoretical hypothesis (a different model, and/or different parameter choices); both the original and the additional weights are thus based solely on matrix-element computations. Once computed, the additional weight can be propagated through the whole simulation chain, which saves one from performing, e.g., a full simulation on an additional event sample. The method works only if both the original and the new hypothesis give non-negligible contributions to the same parts of the phase space.

We support four types of reweighting: one for Leading Order (LO) samples and three for Next-to-Leading Order (NLO) samples. Of the latter, one is NLO accurate (dubbed NLO reweighting) and two are approximate methods based on the LO approach (LO like reweighting and Loop Improved).

Leading Order

At the Leading Order, the new weight is given by

$$W_{new} = |M^{new}_h|^2 /|M^{old}_h|^2 * W_{old} $$
where h is the helicity associated with the event, and $|M^{new/old}_h|^2$ is the matrix element for the corresponding helicity. If the event is not associated with a specific helicity, then the sum over helicities is used instead.
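As a minimal sketch of the formula above (not the MadGraph5_aMC@NLO implementation), the function below rescales a single event weight. The function name and the dictionaries mapping a helicity label to a squared matrix element are hypothetical inputs chosen for illustration, and the numbers are toy values.

```python
def lo_reweight(w_old, me2_old, me2_new, helicity=None):
    """Return W_new = |M_new|^2 / |M_old|^2 * W_old for a single event.

    me2_old / me2_new map a helicity label to the squared matrix element
    (hypothetical containers, assumed to share the same keys).
    If the event carries no helicity, the sum over helicities is used.
    """
    if helicity is None:
        num = sum(me2_new.values())
        den = sum(me2_old.values())
    else:
        num = me2_new[helicity]
        den = me2_old[helicity]
    if den == 0.0:
        # The original hypothesis does not populate this configuration.
        raise ZeroDivisionError("original matrix element vanishes")
    return w_old * num / den


# Toy numbers, for illustration only.
me2_old = {"+-": 2.0, "-+": 1.0}
me2_new = {"+-": 3.0, "-+": 0.5}
w_hel = lo_reweight(1.0e-3, me2_old, me2_new, helicity="+-")
w_sum = lo_reweight(1.0e-3, me2_old, me2_new)
print(w_hel, w_sum)
```

The explicit check on the denominator reflects the requirement that the original hypothesis must populate every phase-space region relevant to the new one.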

This method is fully LO accurate and does not present any bias. Note that the statistical fluctuations of the original sample can be increased by reweighting. To have an idea of such an increase, one can use the naive formula of propagation of errors:

$$\Delta\mathcal{O}_{new} = \bar R\cdot \Delta\mathcal{O}_{old} + \Delta R \cdot \mathcal{O}_{old} $$

where $\bar R$ is the average of the ratio of the matrix elements and $\Delta R$ its associated uncertainty. $\mathcal{O}_{old/new}$ is the value of the observable under consideration for the associated hypothesis and $\Delta\mathcal{O}_{old/new}$ the associated uncertainty.
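The naive propagation formula can be evaluated directly. The sketch below is only an illustration: the function name is hypothetical, the values of $\bar R$ and $\Delta R$ are toy inputs, and the observable is taken to be a cross section with an uncertainty similar in size to the ones quoted later on this page.

```python
def propagated_uncertainty(r_mean, delta_r, obs_old, delta_obs_old):
    """Naive estimate: Delta O_new = R_bar * Delta O_old + Delta R * O_old."""
    return r_mean * delta_obs_old + delta_r * obs_old


# Toy inputs (hypothetical values of R_bar and Delta R).
delta_new = propagated_uncertainty(r_mean=1.1, delta_r=0.2,
                                   obs_old=0.80, delta_obs_old=0.0026)
print(delta_new)
```

With these inputs the second term dominates: a broad distribution of matrix-element ratios degrades the precision much more than the rescaling of the original uncertainty does.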

LO like Reweighting

This corresponds to a LO-type reweighting. Both soft and hard events are reweighted according to the tree-level matrix elements associated with the suitable number of final-state particles i.e.,

$$W^S_{new} = |M^{new}_{born}|^2 /|M^{old}_{born}|^2 * W^S_{old} $$
$$W^H_{new} = |M^{new}_{real}|^2 /|M^{old}_{real}|^2 * W^H_{old} $$

For obvious reasons this method is, in general, not NLO accurate. It is available since MadGraph5_aMC@NLO v2.3.2.

NLO reweighting

For this procedure, we employ the method introduced in http://arxiv.org/pdf/1110.4738v1.pdf to decompose the matrix elements in terms of scale- and PDF-independent coefficients:

$$d\sigma^{H} = d\sigma^E - d\sigma^{MC} $$
$$ d\sigma^{S} = d\sigma^{MC} + \sum_{\alpha=S,C,SC} d\sigma^\alpha $$

where the $d\sigma^\alpha$ terms are written as

$$ d\sigma^\alpha=f_1(x_1,\mu_F)f_2(x_2,\mu_F) \left[\mathcal{W}^\alpha_0 + \mathcal{W}^\alpha_F \log\left(\mu_F/Q\right)^2 + \mathcal{W}^\alpha_R \log\left(\mu_R/Q\right)^2 \right] d\chi$$

and where each of the $\mathcal{W}^\alpha_\beta$ terms is given in terms of coefficients proportional to the Born ($\mathcal{W}^\alpha_{\beta,B}$), to the finite piece of the virtual ($\mathcal{W}^\alpha_{\beta,V}$), and to the real ($\mathcal{W}^\alpha_{\beta,R}$) contributions:

$$\mathcal{W}^\alpha_\beta = B*\mathcal{C}^\alpha_{\beta,B} + V*\mathcal{C}^\alpha_{\beta,V} + R*\mathcal{C}^\alpha_{\beta,R} \equiv \mathcal{W}^\alpha_{\beta,B} + \mathcal{W}^\alpha_{\beta,V} + \mathcal{W}^\alpha_{\beta,R}$$

The various $\mathcal{W}^\alpha_{\beta,\delta}$ terms are computed by MG5_aMC@NLO at running time, and kept in the event record. More details on the decomposition are available in the appendix of http://arxiv.org/pdf/1110.4738v1.pdf (and in a paper in preparation).
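Because the stored coefficients do not depend on the scales, the bracket of the $d\sigma^\alpha$ formula can be evaluated for any choice of $\mu_F$ and $\mu_R$. The sketch below illustrates this for one set of coefficients; the function name is hypothetical, the PDF factors are left out, and $\log(\mu/Q)^2$ is read here as $\log(\mu^2/Q^2)$, which is an assumption about the notation.

```python
import math


def scale_dependent_bracket(w0, w_f, w_r, mu_f, mu_r, q):
    """Evaluate W_0 + W_F log(mu_F/Q)^2 + W_R log(mu_R/Q)^2.

    Assumption: log(mu/Q)^2 means log(mu^2 / Q^2).
    The PDF factors f_1 * f_2 are not included.
    """
    return (w0
            + w_f * math.log((mu_f / q) ** 2)
            + w_r * math.log((mu_r / q) ** 2))


# When both scales equal Q, only the W_0 term survives.
central = scale_dependent_bracket(1.0, 2.0, 3.0, mu_f=10.0, mu_r=10.0, q=10.0)
print(central)
```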

The reweighting is performed on each sub-part of the $\mathcal{W}$'s according to the following formulae (dropping the $\alpha$ and $\beta$ index for simplicity):

$$\mathcal{W}_B^{new} = \frac{B^{new}}{B^{old}} * \mathcal{W}_B^{old} $$
$$\mathcal{W}_V^{new} = \frac{V^{new}}{V^{old}} * \mathcal{W}_V^{old} $$
$$\mathcal{W}_R^{new} = \frac{R^{new}}{R^{old}} * \mathcal{W}_R^{old} $$

with the final weight computed by recombining these weights according to the prescription given before. In MadGraph5_aMC@NLO, we have not implemented this re-weighting.

One potential problem of this method is related to the procedure adopted in the computation of the virtual contribution (see sect. 2.4.3 of http://arxiv.org/pdf/1405.0301.pdf). This speed optimisation can easily increase the statistical error associated with a sub-sample of events. To limit this effect we proposed (and implemented) a second reweighting method. The difference between the two methods should be treated as a systematic uncertainty. For this reweighting, $\mathcal{W}_B$ is split into two pieces: $\mathcal{W}_{BC}$ and $\mathcal{W}_{BB}$. $\mathcal{W}_{BC}$ is the part, proportional to the Born, that is related to the counterterms, while $\mathcal{W}_{BB}$ includes all of the other contributions (the Born itself and the approximate virtual).

The reweighting is then carried out as follows:

$$\mathcal{W}_{BB}^{new} = \frac{(B^{new}+V^{new})}{(B^{old}+V^{old})} * \mathcal{W}_{BB}^{old} $$
$$\mathcal{W}_{BC}^{new} = \frac{B^{new}}{B^{old}} * \mathcal{W}_{BC}^{old} $$
$$\mathcal{W}_V^{new} = \frac{(B^{new}+V^{new})}{(B^{old}+V^{old})} * \mathcal{W}_V^{old} $$
$$\mathcal{W}_R^{new} = \frac{R^{new}}{R^{old}} * \mathcal{W}_R^{old} $$

Such reweighting is fully NLO accurate as well.

This method will be released in a future version of MadGraph5_aMC@NLO and can currently be provided on request. Since it is based on a dedicated decomposition, the NLO sample must be generated in a specific way for the Les Houches event file to contain the necessary information (see below).
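The two NLO-accurate prescriptions above can be compared on a single coefficient. The sketch below is an illustration under stated assumptions, not the actual implementation: the function names and dictionary layout are hypothetical, the recombination shown is only the sum of the Born, virtual and real pieces, and the PDF and scale factors are omitted.

```python
def nlo_reweight(w, old, new):
    """First prescription: each piece is rescaled by its own ratio.

    w        : stored pieces 'B', 'V', 'R' of one coefficient
    old, new : matrix elements 'B', 'V', 'R' for the two hypotheses
    """
    return sum(w[k] * new[k] / old[k] for k in ("B", "V", "R"))


def nlo_reweight_split(w, old, new):
    """Second prescription: W_B is split into 'BB' and 'BC'.

    'BB' and 'V' use the (B+V) ratio, 'BC' the Born ratio, 'R' the real ratio.
    """
    ratio_bv = (new["B"] + new["V"]) / (old["B"] + old["V"])
    ratio_b = new["B"] / old["B"]
    ratio_r = new["R"] / old["R"]
    return ratio_bv * (w["BB"] + w["V"]) + ratio_b * w["BC"] + ratio_r * w["R"]


# Toy numbers, for illustration only.
old = {"B": 1.0, "V": 0.1, "R": 0.5}
new = {"B": 2.0, "V": 0.3, "R": 0.8}
w_split = {"BB": 0.6, "BC": 0.4, "V": 0.05, "R": 0.2}
w_plain = {"B": w_split["BB"] + w_split["BC"], "V": w_split["V"], "R": w_split["R"]}

first = nlo_reweight(w_plain, old, new)
second = nlo_reweight_split(w_split, old, new)
print(first, second, abs(first - second))
```

In this toy setup the two results differ whenever the ratio V/B changes between the hypotheses; that difference is what the text above suggests treating as a systematic uncertainty.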

Loop Improved

Like the NLO reweighting, this method uses the NLO decomposition, but it applies the following reweighting procedure:

$$\mathcal{W}_B^{new} = \frac{B^{new}}{B^{old}} * \mathcal{W}_B^{old} $$
$$\mathcal{W}_V^{new} = \frac{B^{new}}{B^{old}} * \mathcal{W}_V^{old} $$
$$\mathcal{W}_R^{new} = \frac{R^{new}}{R^{old}} * \mathcal{W}_R^{old} $$

with the final weight computed by recombining these weights according to the prescription given in the NLO descriptions.

This method is mainly intended for cases where the virtual cannot be computed, for instance when the Born of the new hypothesis is a loop-induced process (hence the name). Like the LO like reweighting, it is not NLO accurate. One can argue that it should be more accurate than the LO like method, but this statement has not been proven so far.

This method will be released in a future version of MadGraph5_aMC@NLO and can currently be provided on request. Since it is based on a dedicated decomposition, the NLO sample must be generated in a specific way for the Les Houches event file to contain the necessary information (see below).
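For the loop-improved prescription, the only change with respect to the NLO reweighting is that the virtual piece is rescaled with the Born ratio, so that no virtual matrix element is needed for either hypothesis. A minimal sketch, with hypothetical names and toy numbers, and with the recombination reduced to the sum of the three pieces:

```python
def loop_improved_reweight(w, old, new):
    """Rescale 'B' and 'V' pieces by the Born ratio, 'R' by the real ratio.

    w        : stored pieces 'B', 'V', 'R' of one coefficient
    old, new : matrix elements 'B' and 'R' only (no virtual required)
    """
    ratio_b = new["B"] / old["B"]
    ratio_r = new["R"] / old["R"]
    return ratio_b * (w["B"] + w["V"]) + ratio_r * w["R"]


# Toy numbers, for illustration only.
w = {"B": 1.0, "V": 0.05, "R": 0.2}
result = loop_improved_reweight(w, old={"B": 1.0, "R": 0.5}, new={"B": 3.0, "R": 0.25})
print(result)
```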

Technical details

Limitations

  1. Changes of PDFs and/or of cuts are not permitted with these methods of reweighting.
  2. Likewise, changes are not allowed in the functional forms used to compute the hard scales, and hence alpha_S.
  3. In the presence of a decay chain, the order of the particles in the event file is important, and especially so with LHE events not produced by MadGraph5_aMC@NLO.

Installation

This module is built into MadGraph5_aMC@NLO v2.3.2 and later. It relies on f2py; the easiest way to install f2py is to install numpy (if not already done). For NLO and loop-improved reweighting it also relies on lhapdf.

Running the code

Running simultaneously with event generation

When running event generation at the LO or NLO (either via ./bin/generate_events from the local directory or by executing "launch" through the MG5_aMC@NLO interface), you will be asked two questions. The phrasing/options of those two questions depend on whether you run at the LO or NLO, but both follow the same strategy. Here we will take the example of an NLO generation. In that case, the first question is:

The following switches determine which operations are executed:
 1 Perturbative order of the calculation:                               order=NLO
 2 Fixed order (no event generation and no MC@[N]LO matching):    fixed_order=OFF
 3 Shower the generated events:                                        shower=ON
 4 Decay particles with the MadSpin module:                           madspin=OFF
 5 Add weights to the events based on changing model parameters:     reweight=OFF
  Either type the switch number (1 to 5) to change its default setting,
  or set any switch explicitly (e.g. type 'order=LO' at the prompt)
  Type '0', 'auto', 'done' or just press enter when you are done.
 [0, 1, 2, 3, 4, 5, auto, done, order=LO, ... ][60s to answer]

As you can see, the question presents a series of switches which can take different values (in the example "NLO", "ON", "OFF"). In order to perform the reweighting, you need to set the reweight switch to "ON". Type

reweight=ON

You can also just type "5", but please avoid using this mode in scripts. After hitting the <enter> key, the question is asked again and you should now have:

The following switches determine which operations are executed:
 1 Perturbative order of the calculation:                               order=NLO
 2 Fixed order (no event generation and no MC@[N]LO matching):    fixed_order=OFF
 3 Shower the generated events:                                        shower=ON
 4 Decay particles with the MadSpin module:                           madspin=OFF
 5 Add weights to the events based on changing model parameters:     reweight=ON
  Either type the switch number (1 to 5) to change its default setting,
  or set any switch explicitly (e.g. type 'order=LO' at the prompt)
  Type '0', 'auto', 'done' or just press enter when you are done.
 [0, 1, 2, 3, 4, 5, auto, done, order=LO, ... ][60s to answer]

This allows you to change any other switch (note that "fixed_order" needs to stay on OFF). You can type <enter> when you want to pass to the next question:

Do you want to edit a card (press enter to bypass editing)?
  1 / param      : param_card.dat
  2 / run        : run_card.dat
  3 / reweight   : reweight_card.dat
  4 / shower     : shower_card.dat
 you can also
   - enter the path to a valid card or banner.
   - use the 'set' command to modify a parameter directly.
     The set option works only for param_card and run_card.
     Type 'help set' for more information on this command.
   - call an external program (ASperGE/MadWidth/...).
     Type 'help' for the list of available command
 [0, done, 1, param, 2, run, 3, reweight, 4, enter path, ... ][60s to answer]

If you want to perform the NLO-accurate reweighting, the parameter "store_rwgt_info" of the run_card needs to be set to True. Note that this parameter is automatically switched to True if you request an NLO accurate reweighting (or the loop-improved one) at the time of the generation. You must set this parameter manually if you want to perform such a reweighting independently of the event generation (otherwise only the LO like reweighting will be available).

Then type

3

to open an editor (in most systems this is vi) where you can edit the content of the reweight_card. The format/options of such a file are described below, and at the beginning of the file itself. The card allows you to specify which model/benchmark you want to use. When you are done, exit the file and press <enter>.

The code will then start the event generation and when done will directly run the reweighting.

Running the code after the generation of events has been completed

In order to run the reweighting on previously-generated samples, you need to go to the relevant process directory and run either the ./bin/madevent or the ./bin/aMC@NLO script for LO or NLO event generation respectively. You can then type reweight RUN_NAME (RUN_NAME is typically run_01) and you will be asked the same questions as above.

Another option is to manually edit the Cards/reweight_card.dat file and then run either of the two following commands:

./bin/madevent reweight RUN_NAME -f
./bin/aMC@NLO reweight RUN_NAME -f

for the LO and NLO cases respectively.
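When the reweighting is driven from a script, the two commands above can be wrapped in a small helper. This is a sketch only: the function names are hypothetical, the process directory is assumed to exist with Cards/reweight_card.dat already edited, and only the command lines quoted above are used.

```python
import subprocess


def reweight_command(run_name, nlo=False):
    """Build the command line quoted above for an existing run."""
    script = "./bin/aMC@NLO" if nlo else "./bin/madevent"
    return [script, "reweight", run_name, "-f"]


def run_reweight(process_dir, run_name, nlo=False):
    """Run the reweighting from inside the process directory.

    check=True raises CalledProcessError if the script exits with an error.
    """
    return subprocess.run(reweight_command(run_name, nlo),
                          cwd=process_dir, check=True)


print(" ".join(reweight_command("run_01")))
print(" ".join(reweight_command("run_01", nlo=True)))
```

A call such as run_reweight("PROC_DIR", "run_01") would then launch the LO reweighting, where PROC_DIR stands for the path of your own process directory.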

Content of the reweight_card

This card is composed of two sections:

  1. Options:
    These are options that control the behaviour of the reweighting. The lines below must be specified before the first 'launch' command in order to be effective.
    1. change model <XXX>: performs the reweighting with a new model (you then need to provide a full param_card and not the difference between two cards).
    2. change process <XXX>: changes the process definition.
    3. change process <XXX> --add: adds one process definition to the new list.
    4. change output <i>: three options: 'default' (i.e. lhef version 3 format), '2.0' (i.e. lhef version 2 format, the main weight is replaced), 'unweight' (a new unweighting is applied to the event sample).
    5. change helicity <True|False>: performs the reweighting for the given helicity (True, the default) or carries out the sum over helicities (False).
    6. change rwgt_dir <PATH>: changes the directory where the computation is performed. Pointing to a previously existing directory avoids recreating/recompiling the Fortran executable.
    7. change mode <LO|NLO|NLO_tree|LO+NLO>: for LO samples, this line is always ignored. For NLO samples, it selects the reweighting mode: "LO" for the LO like reweighting method, "NLO" for the NLO accurate method, "NLO_tree" for the loop-improved method, and "LO+NLO" to run the LO like and NLO methods simultaneously.
    8. change tree_path PATH: allows the use of an external library of standalone matrix elements (see "Generate external library" below) for the new theory (all tree contributions: Born and real). New in 2.5.6.
    9. change virtual_path PATH: allows the use of an external library of standalone matrix elements (see "Generate external library" below) for the virtual contribution of the new theory. New in 2.5.6.
    10. change systematics OPTS: only for output mode '2.0'; allows running the systematics computation on the new weight with options "OPTS". Note that this assumes that the original and final theories have the SAME power of alpha_s at Born level. New in 2.5.6.
  2. Benchmark definition:
    A benchmark is a given set of parameters within the chosen model. You may create a new benchmark by starting with the command line:
    launch [--rwgt_name=XXX]
    
    The optional flag "--rwgt_name" allows you to choose the name of the weight written in the lhe file. After this command, you can issue a series of "set" commands to specify how to edit the param_card from the original one. There are a couple of equivalent options:
     set mt 150
     set mass 6 150
    
    Rather than using the "set" command, you can also specify the path to a (valid) param_card
     PATH
    
    For a scan over parameters:
    1. You can have multiple command lines "launch" in the file, each of them followed by the associated "set" commands.
    2. You can use the scan syntax of MadGraph5_aMC@NLO as in the following examples:
       set mt scan:range(100,200,20)
       set mass 6 scan:[100,120,140,160,180]
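For larger sets of benchmarks, the card can also be produced by a script. The sketch below only assembles the syntax described above ('change' lines first, then one 'launch' block per benchmark); the helper name and the weight names are hypothetical.

```python
def make_reweight_card(benchmarks, options=()):
    """Return the text of a reweight_card.

    benchmarks : dict mapping a weight name to a list of 'set' arguments
    options    : 'change ...' lines, written before the first 'launch'
    """
    lines = list(options)
    for name, settings in benchmarks.items():
        lines.append("launch --rwgt_name=%s" % name)
        lines.extend("   set %s" % s for s in settings)
    return "\n".join(lines) + "\n"


card = make_reweight_card(
    {"mt150": ["mass 6 150"], "mt160": ["mass 6 160"]},
    options=["change helicity False"],
)
print(card)
```

The resulting string can be written to Cards/reweight_card.dat before running the reweight command with the -f flag.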
      

Generate external library

In order to generate a pre-defined external library, you can run the following command (from 2.5.6):

output standalone_rw --prefix=int PATH

If one wants to modify the content of such a library, the important point is the presence of the file SubProcesses/allmatrix2py.so. This library contains the full information about the matrix elements and the way to access them. It should contain the following functions:

  1. ans = smatrixhel(pdgs,p,alphas,scale2,nhel): returns the matrix element for the given pdgs (no permutation is checked at this level)
  2. initialise(path): loads the param_card
  3. pdg = get_pdg_order(): returns the list of pdgs supported by the module
  4. prefix = get_prefix(): returns the list of prefixes (same order as for the associated pdgs)
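A quick way to inspect such a library is to list which processes it provides. The sketch below uses only the function names listed above; how the module is imported, and the exact types it returns, depend on the local build and are not assumed here. The stand-in class exists purely so that the example runs without a compiled library, and its contents are invented for illustration.

```python
def list_library_processes(lib, param_card):
    """Load the param_card and pair each prefix with its PDG list.

    lib is the imported allmatrix2py module, or any object exposing
    initialise, get_pdg_order and get_prefix.
    """
    lib.initialise(param_card)
    pdgs = lib.get_pdg_order()
    prefixes = lib.get_prefix()
    return list(zip(prefixes, pdgs))


class FakeLibrary:
    """Stand-in for allmatrix2py (hypothetical content)."""

    def initialise(self, path):
        self.card = path

    def get_pdg_order(self):
        return [[21, 21, 25], [21, 21, 25, 21]]

    def get_prefix(self):
        return ["M0_", "M1_"]


processes = list_library_processes(FakeLibrary(), "param_card.dat")
for prefix, pdgs in processes:
    print(prefix, pdgs)
```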

Example of reweight_card.dat

LO example

launch --rwgt_name=op1
   set Dim6 1 1
   set Dim6 2 0
   set Dim6 3 0
   set Dim6 4 0
   set Dim6 5 0
launch --rwgt_name=op2
   set Dim6 1 0
   set Dim6 2 1
   set Dim6 3 0
   set Dim6 4 0
   set Dim6 5 0
launch --rwgt_name=op3
   set Dim6 1 0
   set Dim6 2 0
   set Dim6 3 1
   set Dim6 4 0
   set Dim6 5 0

or, equivalently, with the scan syntax:

launch
   set Dim6 1 scan1:[1.0 if i==0 else 0.0 for i in range(3)]
   set Dim6 2 scan1:[1.0 if i==1 else 0.0 for i in range(3)]
   set Dim6 3 scan1:[1.0 if i==2 else 0.0 for i in range(3)]
   set Dim6 4 0
   set Dim6 5 0
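The expressions after "scan1:" are ordinary Python list comprehensions. The short sketch below evaluates them and groups the values benchmark by benchmark, reading entries that share the tag scan1 as being varied together, which is what makes the scan equivalent to the three explicit launch blocks above. The dictionary keys are only labels for this illustration.

```python
values = {
    "Dim6 1": [1.0 if i == 0 else 0.0 for i in range(3)],
    "Dim6 2": [1.0 if i == 1 else 0.0 for i in range(3)],
    "Dim6 3": [1.0 if i == 2 else 0.0 for i in range(3)],
}

# Benchmark number k takes the k-th value of each list.
benchmarks = list(zip(*values.values()))
for k, point in enumerate(benchmarks, start=1):
    print("benchmark %d:" % k, dict(zip(values, point)))
```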

LO like reweighting

change mode LO
launch
  set Dim6 1 scan1:[1.0 if i==0 else 0.0 for i in range(3)]
  set Dim6 2 scan1:[1.0 if i==1 else 0.0 for i in range(3)]
  set Dim6 3 scan1:[1.0 if i==2 else 0.0 for i in range(3)]
  set Dim6 4 0
  set Dim6 5 0

NLO reweighting

change mode NLO
launch
  set Dim6 1 scan1:[1.0 if i==0 else 0.0 for i in range(3)]
  set Dim6 2 scan1:[1.0 if i==1 else 0.0 for i in range(3)]
  set Dim6 3 scan1:[1.0 if i==2 else 0.0 for i in range(3)]
  set Dim6 4 0
  set Dim6 5 0

Reweighting HEFT sample by loop induced processes (LO sample)

change model loop_sm
change process g g > h [QCD] 
launch
./models/loop_sm/param_card_default.dat

Reweighting HEFT sample by loop induced processes (NLO sample)

change mode NLO_tree
change model loop_sm
change process g g > h [sqrvirt=QCD]
change process p p > h j [sqrvirt=QCD] --add
launch
./models/loop_sm/param_card_default.dat

Output format

The output format complies with the Les Houches agreement version 3 (see http://arxiv.org/abs/arXiv:1405.1067). For example, the header looks like this:

<initrwgt>
<weightgroup type='mg_reweighting'>
<weight id='operator_1'>set param_card dim6 1 100.0
</weight>
<weight id='mg_reweight_1'>set param_card dim6 2 100.0
</weight>
<weight id='mg_reweight_2'>set param_card dim6 3 100.0
</weight>
</weightgroup>
</initrwgt>

and one associated event:

<event>
 8      0 +7.9887000e-06 1.24664300e+02 7.95774700e-02 1.23856500e-01
        1 -1    0    0  501    0 +0.0000000e+00 +0.0000000e+00 +1.3023196e+03 1.30231957e+03 0.00000000e+00 0.0000e+00 -1.0000e+00
       -2 -1    0    0    0  501 +0.0000000e+00 +0.0000000e+00 -1.4499581e+02 1.44995814e+02 0.00000000e+00 0.0000e+00 1.0000e+00
      -24  2    1    2    0    0 -1.2793809e+01 -8.3954553e+01 -1.1792566e+02 1.65987064e+02 8.02071978e+01 0.0000e+00 0.0000e+00
       23  2    1    2    0    0 +1.2793809e+01 +8.3954553e+01 +1.2752494e+03 1.28132832e+03 9.12640692e+01 0.0000e+00 0.0000e+00
       11  1    3    3    0    0 -1.2462673e+01 +1.3647422e+01 -2.6083861e+01 3.19677669e+01 0.00000000e+00 0.0000e+00 -1.0000e+00
      -12  1    3    3    0    0 -3.3113586e-01 -9.7601975e+01 -9.1841804e+01 1.34019297e+02 0.00000000e+00 0.0000e+00 1.0000e+00
        4  1    4    4  502    0 -1.8321803e+01 +9.0929609e+01 +9.3905973e+02 9.43629724e+02 0.00000000e+00 0.0000e+00 -1.0000e+00
       -4  1    4    4    0  502 +3.1115612e+01 -6.9750557e+00 +3.3618969e+02 3.37698598e+02 0.00000000e+00 0.0000e+00 1.0000e+00
<rwgt>
<wgt id='operator_1'> 4.55278761371e-06 </wgt>
<wgt id='mg_reweight_1'> 2.65941887458e-06 </wgt>
<wgt id='mg_reweight_2'> 8.68203803896e-06 </wgt>
</rwgt>
</event>

The above stems from a reweight_card that reads as follows:

launch --rwgt_name=operator_1
   set Dim6 1 100
   set Dim6 2 0
   set Dim6 3 0
   set Dim6 4 0
   set Dim6 5 0
launch
   set Dim6 1 0
   set Dim6 2 100
   set Dim6 3 0
   set Dim6 4 0
   set Dim6 5 0
launch
   set Dim6 1 0
   set Dim6 2 0
   set Dim6 3 100
   set Dim6 4 0
   set Dim6 5 0
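The per-event weights stored in the <rwgt> block can be read back with the Python standard library. The sketch below parses a shortened copy of the event shown earlier (particle lines dropped); the function name is hypothetical, and it assumes the event block is well-formed XML, which may not hold for every LHE file.

```python
import xml.etree.ElementTree as ET


def read_rwgt(event_xml):
    """Return {weight id: value} from the <rwgt> block of one <event>."""
    event = ET.fromstring(event_xml)
    return {w.get("id"): float(w.text) for w in event.iter("wgt")}


# Shortened copy of the example event (only the <rwgt> block is kept).
sample = """<event>
<rwgt>
<wgt id='operator_1'> 4.55278761371e-06 </wgt>
<wgt id='mg_reweight_1'> 2.65941887458e-06 </wgt>
<wgt id='mg_reweight_2'> 8.68203803896e-06 </wgt>
</rwgt>
</event>"""

weights = read_rwgt(sample)
for name, value in weights.items():
    print(name, value)
```

Note that the benchmarks without an explicit --rwgt_name appear under the automatically assigned names mg_reweight_1 and mg_reweight_2.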

The cross sections of the original model and those resulting from the new hypothesis are printed at the end of the run:

INFO: Original cross-section: 0.80086112072 +- 0.0025669959099 pb 
INFO: Computed cross-section: 
INFO: operator_1 : 5.0238030968 
INFO: mg_reweight_1 : 4.46724081967 
INFO: mg_reweight_2 : 0.790019392142 

LO Validation

Comparisons of the fully-inclusive cross sections. Proceed as follows: ./bin/madevent ./Cards/reweight_card.dat

p p > e+ e- cross-section

  1. The reweight_card is:
    launch
     set aewm1 100
    launch 
     set aewm1 200
    launch 
     set aewm1 300
    
  2. The associated cross sections are
    1. 1135.25 pb
    2. 1095.28 pb
    3. 1329.52 pb
  3. The cross sections computed directly with MG5_aMC@NLO are
    1. 1130 +- 2.815 pb
    2. 1098 +- 2.478 pb
    3. 1336 +- 2.777 pb
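To judge the agreement, the difference between the two sets of numbers can be expressed in units of the quoted uncertainty. The sketch below uses the values listed above; the function name is hypothetical, and since the statistical uncertainty of the reweighted numbers is not given, it is ignored here, so the deviations are overestimated.

```python
# Cross sections in pb, copied from the lists above.
reweighted = [1135.25, 1095.28, 1329.52]
direct = [(1130.0, 2.815), (1098.0, 2.478), (1336.0, 2.777)]


def pulls(reweighted, direct):
    """(reweighted - direct) / uncertainty of the direct computation."""
    return [(rw - val) / err for rw, (val, err) in zip(reweighted, direct)]


deviations = pulls(reweighted, direct)
for aewm1, p in zip((100, 200, 300), deviations):
    print("aewm1 = %d: %+.2f sigma" % (aewm1, p))
```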

EWDIM6 Validation

input

  1. The model used for this validation is the EWDIM6 (See: http://arxiv.org/abs/arXiv:1205.4231). The 10k events to be reweighted were generated with the Standard Model (cross-section: 0.8008 ± 0.0026 pb)
  2. The reweight_card was:
    launch
       set Dim6 1 100
       set Dim6 2 0
       set Dim6 3 0
       set Dim6 4 0
       set Dim6 5 0
    launch
       set Dim6 1 10
       set Dim6 2 0
       set Dim6 3 0
       set Dim6 4 0
       set Dim6 5 0
    launch
       set Dim6 1 1
       set Dim6 2 0
       set Dim6 3 0
       set Dim6 4 0
       set Dim6 5 0
    launch
       set Dim6 1 0.1
       set Dim6 2 0
       set Dim6 3 0
       set Dim6 4 0
       set Dim6 5 0
    launch
       set Dim6 1 0.01
       set Dim6 2 0
       set Dim6 3 0
       set Dim6 4 0
       set Dim6 5 0
    

The same scan is performed for the three couplings (CWWW, CW, CB)

Results:

  1. For CWWW
Coupling value ($TeV^{-2}$) | Reweighted cross-section (pb) | MG5_aMC cross-section (pb) | Status
0.01 | 0.800810008029 | 0.7973 ± 0.0023 | OK
0.1 | 0.800903791291 | 0.799 ± 0.0026 | OK
1 | 0.802209013071 | 0.7987 ± 0.0025 | OK
10 | 0.85200014698 | 0.8584 ± 0.00092 | OK
100 | 5.0238030968 | 6.09 ± 0.0082 | FAIL (as expected): Requires too much stats
100 | 5.04763 | 6.09 ± 0.0082 | FAIL (as expected): Requires too much stats (done with a sample of 100k events)

The FAIL entries indicate that the differential results for such a value of the coupling are too different from the Standard Model ones, and discrepancies between the original and reweighted results are indeed normal in this case. Note that this behaviour can be expected simply on the basis of the total cross section results (the SM one and that associated with the new coupling differ by an order of magnitude). On the other hand, the inverse reweighting (which starts from the CWWW=100 sample and reweights back to the SM) works properly; it returns 0.803341120226 pb for the total cross section.

Various differential distributions for the reweightings above are linked below. The dashed blue curve is the one produced by reweighting, while the solid black is the curve generated directly by MG5_aMC@NLO. All samples consist of 100k events.
Plots: 0.1 1 10 100

  2. For CW
Coupling value ($TeV^{-2}$) | Reweighted cross-section (pb) | MG5_aMC cross-section (pb) | Status
0.01 | 0.800798262059 | 0.7953 ± 0.002497 | OK
0.1 | 0.801379445746 | 0.7988 ± 0.0023 | OK
1 | 0.806872565125 | 0.8065 ± 0.0023 | OK
10 | 0.889336417677 | 0.8832 ± 0.003 | OK
100 | 4.46724081967 | 4.519 ± 0.015 | FAIL (as expected)
100 | 4.44273 | 4.519 ± 0.015 | FAIL (as expected) (done with a sample of 100k events)

Same comments as for the previous case.

  3. For CB
Coupling value ($TeV^{-2}$) | Reweighted cross-section (pb) | MG5_aMC cross-section (pb) | Status
0.01 | 0.800798262059 | 0.7977 ± 0.0027 | OK
0.1 | 0.800782626532 | 0.7985 ± 0.0024 | OK
1 | 0.800626859275 | 0.7981 ± 0.002365 | OK
10 | 0.799127987884 | 0.7971 ± 0.0024 | OK
100 | 0.790019392142 | 0.7852 ± 0.0026 | OK
100 | 0.786698206995 | 0.7852 ± 0.0026 | OK (done with a sample of 100k events)

This operator has a smaller impact on the cross section and distributions, and therefore even a large value of the coupling works fine.

Note:

  1. The cross section obtained for a 100k-event sample is 0.7989 ± 0.00087 pb.
  2. The statistical fluctuations of the original sample are reflected in the reweighted cross-section (as expected).

NLO Validation

All validation plots can be found in the following talk:

https://indico.cern.ch/event/458670/contribution/4/attachments/1203988/1753929/EW_reweighting_mattelaer.pdf
