You know that person: live-tweeting their latest sneeze and bowl of chicken noodle soup from bed. No one likes being sick, but no one particularly likes getting a play-by-play of it, either. Except maybe your parents. And now, health researchers.
According to a new paper published in PLOS Currents this week, data from Twitter could play a crucial role in helping to predict and track flu outbreaks. The paper reports that scanning and analyzing tweets "significantly improves" flu forecasting, reducing the error of standard prediction models built on data from the Centers for Disease Control and Prevention by 17 to 30 percent. Twitter's data is also fast, while the CDC's can lag by a week or two and is often revised afterward.
This isn't the first time that researchers have consulted big data from social media for public health purposes. Probably the most famous example to date is Google Flu Trends, an analytics tool from Google's charity arm that aims to predict the location and severity of flu outbreaks. Since Google Flu Trends launched in 2008, its accuracy has been questioned. It significantly overestimated the incidence of flu in 2012-2013 after underestimating the initial wave of swine flu in 2009. One main criticism of Google's model has been that it conflates keyword searches for the flu and flu symptoms with people actually having the flu.
Mark Dredze, an assistant research professor of computer science at Johns Hopkins University and an author of the paper, says he and his colleagues have tried to be more careful in sorting through the data. He explains that they collect about 5 million tweets a day that contain health-related (not just flu-related) keywords. From those, they first separate out the tweets about the flu, and then keep only the ones that seem to be specifically about having the flu. To do all this, the team uses language processing algorithms that look at all sorts of factors: the words in the tweet, the order of those words, and what role each takes in the sentence.
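For readers who like to see the idea in code, here is a minimal sketch of that three-stage narrowing: health-related tweets first, then flu tweets, then tweets that appear to report an actual infection. It is a toy, not the researchers' system. The keyword lists, the regular expression, and every function name below are assumptions made up for illustration, and the simple rules stand in for the trained language processing algorithms the team actually uses.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
HEALTH_TERMS = {"flu", "influenza", "fever", "cough", "headache", "sick", "vaccine"}
FLU_TERMS = {"flu", "influenza"}

# Crude stand-in for a trained classifier: first-person phrasing before the
# word "flu" (within one sentence) is treated as a report of being ill.
INFECTION_PATTERN = re.compile(
    r"\b(i have|i've got|i got|i caught|i'm down with)\b[^.!?]*\bflu\b"
)


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def is_health_related(text):
    """Stage 1: does the tweet contain any health keyword?"""
    return bool(tokens(text) & HEALTH_TERMS)


def is_about_flu(text):
    """Stage 2: does the tweet mention the flu specifically?"""
    return bool(tokens(text) & FLU_TERMS)


def reports_infection(text):
    """Stage 3: does the author seem to be saying they have the flu?"""
    return INFECTION_PATTERN.search(text.lower()) is not None


def flu_infection_tweets(tweets):
    """Apply the three stages in order and return the surviving tweets."""
    health = [t for t in tweets if is_health_related(t)]
    about_flu = [t for t in health if is_about_flu(t)]
    return [t for t in about_flu if reports_infection(t)]


sample = [
    "Ugh, I have the flu and a fever of 102",
    "Getting my flu shot at the pharmacy today",
    "New study on flu forecasting published this week",
    "My headache will not go away",
    "Great soup recipe for dinner",
]
infected = flu_infection_tweets(sample)
```

On this made-up sample, only the first tweet survives all three stages. The flu-shot and flu-study tweets mention the flu without reporting an illness, which is exactly the distinction that pure keyword counting misses.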
Dredze says the research doesn't indicate that Twitter is "definitively better" than Google Flu Trends, but that in conjunction with the CDC's data, it produces more accurate forecasts than either the CDC's data alone or models that use Google's information. He and his fellow researchers have made their data available to the CDC and are working with the agency to make the Twitter forecasts more accessible to everyone.
"The people who benefit most from this in the U.S. are the people at the state, city, and county levels," Dredze says. More accurate flu forecasts mean more accurate planning for things like hospital staffing and vaccine pushes. "There are a lot of different people who care about this," he adds.
So go ahead, tweet about your flu symptoms. You're officially doing something in the interest of public health.