Particle-physics experiments are very complex, and a lot of thought and careful work is needed throughout the long process of recording the data, processing them, and then understanding what they tell you. Today I spent some time working on both the starting and finishing points of this process.
In our experiment, protons and antiprotons collide 7.5 million times per second. We need the collision rate to be that large because we are interested in very rare occurrences such as top-quark production; the more data, the better. But our detector, as sophisticated and fast as it is, can only record data from the collisions to computer disk at the rate of 75 times per second, which means that we must throw away the data from 99,999 out of every 100,000 collision events. It turns out that most of the 99,999 are not very interesting to us, and we can live without them. But any time you throw away data, you are taking the risk that you are throwing away something important, and in this particular case, we're throwing away 99.999 percent of our data before a human ever looks at them. We'd better know what we are doing.
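For anyone who wants to check the arithmetic, here is a minimal sketch using only the two rates quoted above. The variable names are mine, chosen for illustration.

```python
# Rates quoted in the text.
collisions_per_second = 7_500_000   # proton-antiproton collisions
recorded_per_second = 75            # events written to disk

# Fraction of collisions that can be kept, and its inverse.
keep_fraction = recorded_per_second / collisions_per_second
collisions_per_kept_event = collisions_per_second // recorded_per_second

# Percentage of collisions that must be discarded.
discard_percent = 100 * (1 - keep_fraction)

print(f"keep 1 collision in {collisions_per_kept_event:,}")
print(f"discard {discard_percent:.3f} percent of collisions")
```

The two printed lines should match the "99,999 out of every 100,000" and "99.999 percent" figures above.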
If we were just to keep and toss events at random, we would never keep enough of the rare collisions that interest us. Instead we have developed specialized, high-speed electronics that can process a little bit of the data from a collision very quickly, decide if they are interesting, and then decide whether to keep the rest of them. I designed some of the electronics; my circuit boards measure the time at which a charged particle flew by particular points in our detector. This information is used to identify charged particles that carry a lot of energy, which means that they might be of interest to us, and therefore we should record the event for further study.
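To see why random selection fails, here is a rough sketch. The collision and recording rates come from the text; the rarity of the interesting process (`rare_fraction`) is a made-up number for illustration, not a measured rate, and the year of nonstop running is likewise an assumption.

```python
def rare_events_kept_at_random(collision_rate, record_rate, rare_fraction, seconds):
    """Expected number of rare events kept when events are kept at random.

    Every collision has the same chance of being recorded, so the rare
    events are thinned out by exactly the same factor as everything else.
    """
    produced = collision_rate * rare_fraction * seconds
    keep_probability = record_rate / collision_rate
    return produced * keep_probability


collision_rate = 7_500_000      # collisions per second (from the text)
record_rate = 75                # recorded events per second (from the text)
rare_fraction = 1e-10           # HYPOTHETICAL: one rare event per ten billion collisions
seconds_per_year = 365 * 24 * 3600  # assumes nonstop running, for simplicity

produced = collision_rate * rare_fraction * seconds_per_year
kept = rare_events_kept_at_random(
    collision_rate, record_rate, rare_fraction, seconds_per_year
)

print(f"rare events produced in a year: {produced:.0f}")
print(f"rare events kept by random selection: {kept:.2f}")
```

With these illustrative inputs, tens of thousands of rare events are produced but fewer than one survives a random selection, which is why the decision has to be made by looking at the event itself.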
Designing the boards was one thing; making them work and keeping them working is another, and we've tended to run low on spare parts. This summer, one of our undergraduates has been trying to get some of our problem children working. Dave has made good progress, but some of the boards have been pretty tricky, so I spent some time working with him this morning. We check a board by sending test inputs to it, and making sure that the output electronic signals are what we expect them to be. If they aren't, we work through the board methodically, trying to find where things start going wrong. When fixing circuit boards, I tend to think about the scientific method; you must be careful about changing only one thing at a time, so that you can really understand the effect of each change. Dave and I had mixed luck today, as some of the problems seemed to come and go at random. All the scientific method in the world can't help you if you don't have a stable system to look at.
Meanwhile, I was still bothered by the problems that Nate was showing us yesterday. We definitely are having trouble understanding that data sample. But what does "understand" mean, anyway? To borrow an example from my thesis adviser (thanks, boss!): Imagine that your job is to determine how many diners in a restaurant are natural redheads. It sounds easy enough—go into the restaurant and count the number of people with red hair. But there are probably some true redheads who have gone bald or gray, and there may also be some apparent redheads who got their color out of a bottle. You might be undercounting or overcounting.
No, you're not allowed to ask people about their hair color—you'll have to develop some more sophisticated tools. Let's focus on the overcounting problem. How do you estimate the number of fake redheads? How many people who come to this restaurant dye their hair red? To answer that, you could spend some time at the beauty parlor and see how many people come through for a dye job and use that number to make an estimate of your "fake" rate. But is the beauty parlor a good model of the restaurant? What if people who get their hair done there don't like Ethiopian food?
This is an unrealistic analogy, but it shows the basic problem of experimental particle physics. You want to isolate a sample of collision events that include a particular kind of particle (redheads) for study, but you never have a perfectly pure sample, and you always miss some of the desired events. You have to compensate by estimating the fakes (the dyed redheads) and the losses (the bald ones), and your estimates are based on studies in other samples (the beauty parlor). Counting redheads in a restaurant is easy; it's finding the other samples and figuring out how to get good information out of them that takes the greatest effort.
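In its simplest form, the compensation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the single fake rate and single efficiency, and all of the numbers are hypothetical, and a real analysis has to estimate each input and its uncertainty from separate samples.

```python
def estimate_true_count(observed, fake_rate, efficiency):
    """Correct an observed count for fakes and for losses.

    observed   -- number of apparent redheads counted in the restaurant
    fake_rate  -- estimated fraction of those who dyed their hair
                  (the number you would get from the beauty parlor)
    efficiency -- estimated fraction of natural redheads who still
                  look like redheads (not bald or gray)
    """
    if not 0 <= fake_rate <= 1:
        raise ValueError("fake_rate must be between 0 and 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    # Remove the estimated fakes from what was counted...
    genuine_observed = observed * (1 - fake_rate)
    # ...then scale up to account for the true redheads that were missed.
    return genuine_observed / efficiency


# HYPOTHETICAL inputs, for illustration only.
estimate = estimate_true_count(observed=40, fake_rate=0.25, efficiency=0.8)
print(f"estimated natural redheads: {estimate:.1f}")
```

The formula is trivial; the effort goes into justifying the values of `fake_rate` and `efficiency`, which is exactly the beauty-parlor problem.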
I was musing on this problem with Dave, Nate's thesis adviser. (Not the same Dave as above; it's a common name in particle physics, for some reason.) Nate seems to have a problem accounting for the fake redheads in his sample. At first it looked to us like he was having trouble counting the customers at the beauty parlor, but now we are starting to think that it's worse—he may actually be counting the cats at the pet store. We spent some time looking at what other people have done with this problem in the past and coming up with ways to apply those solutions to our particular data sample. Now we have a list of things for Nate to work on, some of which might answer our questions—we'll have to try them and see what works! I'm going to Fermilab tomorrow, so I'll be able to tell Nate about this when I get there.