Fork me on GitHub

Opened 10 years ago

Closed 10 years ago

#362 closed Bug (fixed)

Fake Photons (boosted pi0)

Reported by: Keith Pedersen Owned by: Pavel Demin <pavel.demin@…>
Priority: minor Milestone:
Component: Delphes code Version: Delphes 3
Keywords: Isolation fake photon jet Cc:

Description

Hello,

I believe I have found a bug in the Isolation module (depending on how Candidate::Overlaps was intended to be used inside Isolation).

First, let me explain my analysis. I am looking at (p p > b b~) in the boosted regime (i.e. ptb > 500 GeV) with no pile-up. I've been getting a rather high rate of EFlowPhotons (~6% of events), which are certainly "fake" (i.e. jet related) since there are no photons in my ME. Most of these photons are rather energetic (50 - 500 GeV). I did some fiddling to track down the proximate source, and discovered they were highly boosted pi0. The smoking gun was making pi0 invisible in the Calorimeter:

add EnergyFraction {111} {0.0 0.0} 

With no pi0, the photons completely vanished.

The thing is, since these EFlowPhotons are part of the EFlowMerger, they are allowed to cluster into jets, and they often pick up a significant amount of additional pT. Often the pT of the resulting jet is more than double the photon's pT. Since I am using the same deltaR parameter (0.5) for Isolation and FastJetFinder (using anti-kt, whose jets are rather circular), I found it strange that these photons were getting isolated and at the same time clustering into jets much larger than themselves (given PTRatioMax of 0.1).

So I dug into the Calorimeter module (which generates my IsolationInputArray: tracks, EFlowPhotons and EFlowNeutralHadrons) and the Isolation module, and spent some time investigating their inner workings.

The problem manifests itself in the Isolation double-loop. The outer loop runs over all isolation candidate (e.g. EFlowPhotons or muons). The inner loop runs over all isolation particles (i.e. those which could de-isolate the candidate, in my case the particles in the EFlowMerger). If an isolation particle is within a cone of fDeltaRMax around the candidate, its pT is added to sum (as presented below):

while((isolation = static_cast<Candidate*>(itIsolationArray.Next())))
{
  const TLorentzVector &isolationMomentum = isolation->Momentum;
		
  if(candidateMomentum.DeltaR(isolationMomentum) <= fDeltaRMax &&
   !candidate->Overlaps(isolation))
  {
    sum += isolationMomentum.Pt();
    ++counter;
  }
}

The root of the problem is the call:

candidate->Overlaps(isolation)

If this call returns true, the pT of isolation is not added to sum.

I can only assume (since the code is not commented), but it appears that Overlaps is used to make sure that the pT of the candidate itself is not added to sum. To do this, Overlaps ensures that candidate and isolation aren't the same (via their uniqueID) AND that they don't share any particles with the same uniqueID at ANY level of their fArray.

The problem is that, using EFlow, each Calorimeter tower is associated with an EFlowPhoton and an EFlowNeutralHadron, along with any tracks which strike the tower. As it is currently implemented, all of these objects are guaranteed to Overlap as longs as they are associated with the same tower (I will explain below). So, when you're isolating EFlowPhotons, any hadronic/track energy in the same tower is going to be excluded from the Isolation sum. When you're dealing with highly boosted jets, with highly collimated central cores, you're going to exclude a lot of pT from sum, thus isolating a lot more EFlowPhotons.

The reason that the three classes of EFlow particles from the same tower are guaranteed to Overlap is simple. In Calorimeter, each tower is initially filled as a single object (fTower). All particles that strike the tower are stored in fTower->fArray (in their perfect, un-smeared form). When it comes time to subtract track energy and create EFlowPhotons and EFLowNeutralHadrons, both EFlow objects are created via fTower->Clone(). This means that they now share the same fArray, so they will necessarily Overlap each other. Also, each track keeps its perfect, un-smeared particle in its own fArray and, since the perfect particle struck the tower and got added to the tower's fArray, all tracks will Overlap the EFlow objects of the same tower.

In order to test this hypothesis, I implemented a quick-and-dirty fix. I know that my IsolationInputArray doesn't contain any redundant energy (its just tracks, EFlowPhotons and EFlowHadrons, straight from the EFlowMerger). I also know that every Isolation candidate is replicated in my IsolationInputArray (but, since there's no redundancies, only once). So, I simply disabled Overlaps() and assumed that each candidate had its own pT added to sum exactly once, then subtracted its own pT after the inner-loop:

    while((isolation = static_cast<Candidate*>(itIsolationArray.Next())))
    {
      const TLorentzVector &isolationMomentum = isolation->Momentum;
		
		if(candidateMomentum.DeltaR(isolationMomentum) <= fDeltaRMax /*&&
         !candidate->Overlaps(isolation)*/)
      {
        sum += isolationMomentum.Pt();
        ++counter;
      }
    }
    // Subtract the candidate's own pT
    sum -= candidateMomentum.Pt();

When I re-ran my events, the number of events which contained photons dropped from ~6% to ~1.8%. Additionally, those photons still present almost never got clustered into jets (and when they did, they only sucked up a small percent of their own energy). Thus, "isolated" is now a rather appropriate label for these pi0 photons.

Due to the lack of comments in the code, I do have some additional questions: Was Overlaps used in this way intentionally (to neglect energy within the same tower), or was this an unforeseen side-effect? I know that in this paper, the CMS collaboration describes throwing out a central core of tracks (deltaR < .04) to prevent casting false negatives on photons which pre-showered to e+e-, but this effect is not simulated in Delphes.

Furthermore, I would like to add that Overlaps is a rather expensive call. When I profiled a Pythia8/Delphes combination recently, I found that 70% of the runtime was Pythia8, 16% was Isolation (of which 12% was Overlaps) and 7.5% was FastJetFinder (anti-kt, with no jet area). My recommendation (assuming this is a bug) would be a "less safe" Isolation module (under a different name) where the user must be more careful about what they put in. This module would have no calls to Overlaps (or, at the very most, a "first-order" Overlaps which does not dive into either Candidate's fArray).

I hope I have not mis-categorized this as a bug,
Keith

Attachments (5)

Delphes_Isolation_problem.JPG (1.0 MB ) - added by Keith Pedersen 10 years ago.
A pictorial representation of the problem
Delphes_Isolation_problem_thumb.png (401.7 KB ) - added by Keith Pedersen 10 years ago.
A pictorial representation of the problem (thumbnail)
delphes_card_CMS_LHCO.tcl (17.1 KB ) - added by Keith Pedersen 10 years ago.
KDP's Delphes card
matching.ini (4.3 KB ) - added by Keith Pedersen 10 years ago.
KDP's pythia card
unweighted_events.lhe.gz (16.3 MB ) - added by Keith Pedersen 10 years ago.
Boosted events using matching (p p > Z' > b b~ + jets)

Change History (15)

comment:1 by Pavel Demin, 10 years ago

Dear Keith,

Thank you for your detailed analysis of this problem!

I agree that in case of many particles hitting the same tower, the Overlaps method becomes expensive.

We've already seen a similar problem with the Overlaps method in the UniqueObjectFinder module when it's used with highly overlapped input arrays.

Rather than adding a new module, I could add a parameter that will allow user to select "safe" or "less safe" algorithm.

Could, you please send me your Delphes and Pythia configuration files? This will allow me to reproduce this problem.

I've just noticed that InputArray in PhotonEfficiency was at some point changed from

  set InputArray Calorimeter/photons 

to

  set InputArray Calorimeter/eflowPhotons

I'm not yet sure but it could be the real source of this bug.

Initial idea was that the Calorimeter module produces photon candidates that don't overlap with any other object and the problem that you describe should not happen.

Could you, please check if replacing Calorimeter/eflowPhotons with Calorimeter/photons in the PhotonEfficiency module solves this problem?

Regards,

Pavel

Last edited 10 years ago by Pavel Demin (previous) (diff)

by Keith Pedersen, 10 years ago

A pictorial representation of the problem

by Keith Pedersen, 10 years ago

A pictorial representation of the problem (thumbnail)

comment:2 by Keith Pedersen, 10 years ago

Pavel,

Thank you for your quick response.

Using Calorimeter/photons will run into the same problem because of the common content in the fArray. I've summarized my original post with a pseudo-flowchart (a reduced size thumbnail is inserted below, check attachments for original picture).

https://cp3.irmp.ucl.ac.be/projects/delphes/raw-attachment/ticket/362/Delphes_Isolation_problem_thumb.png

I've also attached my LHE, pythia config file, and Delphes configuration card. A few comments about my Delphes card (in case it looks a little weird):

*I use a DelphesModule of my own design (LHCOWriterDirect) which skips the Root file and writes directly to LHCO, as my analysis code is designed for LHCO input. You will have to put a TreeWriter back into my Delphes card if you intend to write anything out.

*I removed the second round of particle efficiency since the track/identity information is already used for EFlow (i.e. should the track energy be removed from the ECAL or HCAL?), so subsequently removing my awareness of a particle doesn't seem consistent (to me).

*I don't use UniqueObjectFinder because I don't want to entirely kill a jet if it contains a unique object. My LHCOWriterDirect module subtracts unique objects from jets and writes out both (which was crucial to diagnosing the problem).

*Since my analysis involves a detailed study of muons in jets, I don't bother isolating muons, I just write them all out (subtracting them from jets if absorbed).

*My main(), when transferring the Pythia even record to Delphes, only adds partons to Delphes/partons directly before hadronization (i.e. after all fragmentation). I cluster these into jets so I can get a sense of the flavor/nature of each jet. The ME level partons are placed into another list (which I'm not currently using).

by Keith Pedersen, 10 years ago

Attachment: delphes_card_CMS_LHCO.tcl added

KDP's Delphes card

by Keith Pedersen, 10 years ago

Attachment: matching.ini added

KDP's pythia card

by Keith Pedersen, 10 years ago

Attachment: unweighted_events.lhe.gz added

Boosted events using matching (p p > Z' > b b~ + jets)

comment:3 by Pavel Demin, 10 years ago

Thank you for all the files. I can now run DelphesPythia8 with your configuration files.

I've also noticed that you set photon efficiency to 1.0 with the following expression in the PhotonEfficiency module:

set EfficiencyFormula {1.}

This way this module outputs ~200 photons per event.

Setting efficiency to 0 for photons with pT < 1GeV/c removes 99% of all photons:

set EfficiencyFormula {(pt <= 1.0) * (0.) + (pt > 1.0) * (1.)}

Then PhotonIsolation removes almost all of them.

I've also checked that switching to Calorimeter/photons actually helps.

The objects in this array are preselected with the following expression:

if(fTowerPhotonHits > 0 && fTowerTrackHits == 0)
{
  fPhotonOutputArray->Add(fTower);
}

So, there should not be any track touching these objects.

Objects from Calorimeter/eflowPhotons are not preselected and they can indeed have an associated track.

I think that using Calorimeter/eflowPhotons as input for the PhotonEfficiency module is a bug and it should be fixed.

comment:4 by Keith Pedersen, 10 years ago

Pavel,

Thanks for the recommendations for PhotonEfficiency; I never thought of using it as a cut :)

Sorry to keep harping on this point:

I agree that tracks won't touch the (Calorimeter/photons) objects, so they will have no conflict with tracks. However, each non-eFlow-photon in (Calorimeter/photons) will still have the same fArray as the (Calorimeter/eFlowNeutralHadrons) from the same tower. So if (EFlowMerger/eFlow) is used as the IsolationInputArray, then any "neutral hadrons" from the same tower (including missed tracks) will still Overlap the non-eFlow-photon and won't contribute to the pTsum in Isolation.

Of course, these neutral hadrons are actually still inside the non-eFlow-photon, but Isolation doesn't look at them (or calculate Had/Em or anything like that). And if they're labelled as isolated and also clustered into a jet, anyone using UniqueObjectFinder will veto the jet, which won't get written to the Tree, precluding any chance to de-isolate the photon in post and recover the jet.

IMHO, I think working with eFlowPhotons is much preferable when studying heavy-flavor jets because EFlow stips out all the (sometimes highly energetic) electrons from semi-leptonic decays. Thus, my vote is for the "safe" and "lessSafe" switch you mentioned in your first response (I'm assuming "lessSafe" just checks that the uniqueID aren't the same, or something similar).

Cheers,
Keith

comment:5 by Pavel Demin, 10 years ago

Dear Keith,

With my colleagues, we've discussed possible solutions and it's the very last solution (a "first-order" Overlaps) that you initially proposed that seems the most universal.

So, I'm planning to replace the following lines in the Isolation module

      if(candidateMomentum.DeltaR(isolationMomentum) <= fDeltaRMax &&
         !candidate->Overlaps(isolation))

with

      if(candidateMomentum.DeltaR(isolationMomentum) <= fDeltaRMax &&
         candidate->GetUniqueID() != isolation->GetUniqueID())

What do you think, would it be enough to solve this problem?

Regards,

Pavel

comment:6 by Pavel Demin <pavel.demin@…>, 10 years ago

Owner: set to Pavel Demin <pavel.demin@…>
Resolution: fixed
Status: newclosed

In df3503347d15326a649d212b257709113f592736/git:

replace call to Overlaps with UniqueID comparison (close #362)

comment:7 by Chase, 10 years ago

Hi,
The fix posted in commit df35033 appears to break "normal" photons. I don't know enough about it (nor have enough time) to do such a detailed analysis. But basically I've just generated some gamma-jet samples w/ MadGraph. I'm using Dephes 3.2.0 (separately installed, since MG still distributes v3.1.2). I found that after Delphes output, there were consistently *zero* photons in my samples! After some work I traced it to the (Photon)Isolation module. Since the only change since v3.1.2 was this one, I tried reverting to

!candidate->Overlaps(isolation)

and now it works as expected. I'm guessing maybe it's related to the fact that the isolation input is from Calorimeter/eflowPhotons, while the candidates are Calorimeter/photons, so the UniqueID's don't match up?

I do not think I'm doing anything exotic, but I'm puzzled as to how something that breaks nearly all photons in Delphes could go unnoticed for 6 months. I'm using the default delphes_card_ATLAS config and pythia (as run by MadGraph, pythia-pgs v2.4.3). Let me know if you need an LHE file to test on, but as far as I can tell any event with a real photon as a final-state parton will show this behavior.

comment:8 by Keith Pedersen, 10 years ago

Hello,

I've been thinking about how we can make a fix that accommodates everyone's needs. Basically, we want EFlow candidates (EFPhotons, EFNeutHad, and tracks) to NOT overlap each other, even though their fArrays overlap (by construction). However, we want EFPhotons to overlap CalPhotons (although EFNeutHad should probably NOT overlap CalPhotons).

After giving this some though, I think I have found a suitable solution.

A CalPhoton is only produced when (1) a truth-level photon or electron strikes the tower, ostensibly producing a distinct EM shower in the ECal (although there are no energy cuts for this EM shower particle) AND (2) no reconstructed tracks strike the tower.

Because of condition 2, there will be no track energy to subtract from the EF objects. This means that the EF objects should be the CalPhoton split into its ECal and HCal components (with the same Position, Edges, etc.).

Thus, the simplest way to solve this problem is, when both CalPhoton conditions are met, add the EFPhoton to Calorimeter/photons and the EFNeutHad to Calorimeter/towers. This immediately addresses what seems to me to be another problem with CalPhotons -- that they also contain the neutral hadronic energy.

Of course, the user will have to be careful how the merge their arrays, but this has always been the case.

I will code this up in my fork and submit it as a Pull request, for approval by the Delphes team.

Keith

comment:9 by Keith Pedersen, 10 years ago

Resolution: fixed
Status: closedreopened

comment:10 by Michele Selvaggi <michele.selvaggi@…>, 10 years ago

Resolution: fixed
Status: reopenedclosed

In 3cac201f880f085313068627a8b275eab384601e/git:

fixed input collection for photon isolation (close #362, close #657)

Note: See TracTickets for help on using tickets.