wiki:IntroGrid
Last modified 5 years ago Last modified on 05/02/12 09:34:17

Gridpacks : an introduction

General logic

MadGraph/MadEvent is highly optimized to run in parallel on computer clusters. The "single-diagrams-enhanched multichannel" integration technique makes it possible to split the phase space integration into small bits that can be evaluated independently. However, to get correct results it is mandatory that all these jobs are retrieved without errors. In principle such an architecture can also work efficiently over the LHC computing grid, which consist of many computer clusters around the world, apart from the fact that there is a larger overhead for sending jobs and, more importantly, the probability that some jobs might get "lost" cannot be neglected. Resending the lost jobs is an option, but not very efficient. We have therefore designed a special mode for MadGraph/MadEvent that creates a self-contained package that can be sent over the grid. This package, "gridpack.tar.gz", is optimized for a specific process with fixed model and run parameters and is supposed to run on a single machine. The package needs to be compiled on a grid compatible machine after which it can be send over the grid with as only inputs the requested number of events for that run and the random number seed. The events generated by a gridpack with a given random number seed are independent of events generated with a different random number seed. The events from a single job are distributed in phase space according to their physical expectation, so any number of jobs can savely be added or removed. Gridpacks for "SM candle" processes at the LHC (both for 14 !TeV as well as 10 !TeV) can be found on the MadGraph Samples wiki page.

How are the jobs organized and chosen?

As we all know the error in an event sample with N events scales like $\sqrt(N)$. Hence, also in the generation of events there is no need to go beyond this precision. So, when generating a small number of events, many channels that have only a small contribution to the total cross section can be ignored. This is what the running the gridpack does: Because you are only generating a very small number of events with a given gridpack, the error is large and many subprocesses can be ignored.

However, we cannot simply ignore all the smallest subprocesses and only evaluate the largest ones: we want to combine the events from many gridpack jobs to one big event file. The relative error in this event file should be much smaller, and therefore also events from subprocesses that have only a small contribution should be evaluated. To overcome this problem the contributions from each of the subprocesses is calculated with high precision in the creation of the gridpack. Then the gridpack jobs randomly include subprocesses based on their relative contributions to the total cross section.

Because we know the relative error for a given number of events for a single job, we can put a minimum to the events generated from one subprocess. We call this minimum number the "granularity". By default we set the granularity to the square root of the number of events. Therefore the minimum number of events generated from each subprocess is $\sqrt(N)$ and, hence the maximum number of subprocesses calculated per gridpack job is $\frac{N}{\sqrt(N)}$. Setting the granularity to the square root of the number of events makes sure that the smallest number of subprocesses needs to be calculated, but keeping the events calculated by a single gridpack job correct in the sense that they are distributed correctly over all the subprocesses and phase space within the expected uncertainty.

An example

Suppose you want to generate 1 million events. With the gridpack you could choose to do 200 runs in which each run generates N=5000 events. (Remember to use a different random number seed for each run). Each of the event samples returned by a single gridpack run has a physically distributed set of events, i.e. with an expected error of $\sqrt(5000)=71$. So the granularity can safely be set to the same number as giving it a lower value does not improve the error. Because the error is relatively large, there might be many important subprocesses that are not evaluated, but because the channels are chosen randomly, each gridpack run evaluates a different set of channels such that the total error on the 1 million events is only $\sqrt(10^6)=1000$.

In specific cases the granularity could be increased, e.g., if you know beforehand that you will produce a lot of events. In this example, where the total number of events will be a million, the uncertainty will be $\sqrt{10^6}=1000$. Hence the granularity could have been set to 1000 as this will generate at least 1000 events for each channel. Hence you'll never be off by more than 1000 events. (The value for the granularity can be set by passing it as the 3rd argument when executing the ./run.sh script). However, this should be used with great care, because in the case where not all the gridpack jobs can be retrieved, or if only a subset of the total number of events is analyzed, you are making an error. It is therefore highly recommended to not touch the default value for the granularity at leave the 3rd argument of the =./run.sh= script empty.

Parameters to be used for a small unbiased generation

Once the gridpack has been generated the number of events and the random number seed are the only input variables. They are given as arguments of the gridpack execution script. In principle there is no minimal number of events that can be generated with a single run of a gridpack, but we suggest to use something between 1000 and 10000. Of course, any number of gridpacks can be send simultaneously over the grid and the returned events will be unbiased towards each other if a different random number seed had been chosen for each one of them.

Difference between normal cluster running / gridpack running.

For the generation of events in normal cluster running all the possible contributions to a given processes are chopped into small parts and send as jobs simultaneously to a computer cluster. All these little jobs execute a part of the total contribution and generate events for this small part. Only after all the jobs (and their generated events) are retrieved and combined the final event sample is created. On the other hand, the gridpack should be executed on a single machine. In principle it will run all the little jobs described above in serial, except that the requested number of events is in general much smaller. Therefore only a small subset of this large number of jobs needs to be executed, chosen in such a way to have an unbiased sample. The number of subprocess evaluated is controlled by the granularity setting, see above. As the number of events per job is small and the error on an event sample scales like $\sqrt(N)$, where N is the number of events, a great optimization procedure can be included here. The major difference between normal cluster run and the gridpack running is therefore:

While for the a normal madgraph run events from all the subprocesses are included in the final event sample, for a gridpack run only a subset of the subprocesses are evaluated. This subset is randomly chosen according to their weight to the total cross section, keeping in mind that the error in N produced events scales like $\sqrt(N)$.

Technical details for setting up and running a gridpack

See here for more technical details.