Inside the Lab That’s Quantifying Happiness
At the University of Vermont, mathematicians in the Computational Story Lab are reading your tweets and learning a lot about our collective well-being
In Mississippi, people tweet about cake and cookies an awful lot; in Colorado, it’s noodles. In Mississippi, the most-tweeted activity is eating; in Colorado, it’s running, skiing, hiking, snowboarding, and biking, in that order. In other words, the two states fall on opposite ends of the behavior spectrum. If you were to assign a caloric value to every food mentioned in every tweet by the citizens of the United States and a calories-burned value to every activity, and then totaled them up, you would find that Colorado tweets the best caloric ratio in the country and Mississippi the worst.
Sure, you’d be forgiven for doubting people’s honesty on Twitter. On those rare occasions when I destroy an entire pint of Ben and Jerry’s, I most assuredly do not tweet about it. Likewise, I don’t reach for my phone every time I strap on a pair of skis.
And yet there’s this: Mississippi has the in the country and . Mississippi has the ; Colorado has the lowest. Mississippi has the in the country; Colorado is near the top. Perhaps we are being more honest on social media than we think. And perhaps social media has more to tell us about the state of the country than we realize.
That’s the proposition of Peter Dodds and Chris Danforth, who co-direct the University of Vermont’s Computational Story Lab, a warren of whiteboards and grad students in a handsome brick building near the shores of Lake Champlain. Dodds and Danforth are applied mathematicians, but they would make a pretty good comedy duo. When I stopped by the lab recently, both were in running clothes and cracking jokes. They have an abundance of curls between them and the wiry energy of chronic thinkers. They came to UVM in 2006 to start the , which crunches big numbers from big systems and looks for patterns. Out of that, they hatched the Computational Story Lab, which sifts through some of that public data to discern the stories we’re telling ourselves. “It took us a while to come up with the name,” Dodds told me as we shotgunned espresso and gazed into his MacBook. “We were going to be the Department of Recreational Truth.”
This year, they teamed up with their PhD student Andy Reagan to launch the Lexicocalorimeter, an online tool that uses tweets to compute the calories in and calories out for every state. It’s no mere party trick; the Story Labbers believe the Lexicocalorimeter has important advantages over slower, more traditional methods of gathering health data. “We don’t have to wait to look at statistics at the end of the year,” Danforth says. “This sort of data is available every day. We can tell if a public health campaign to invest in school nutrition is changing the way people talk about food or engage in activities.” For example, what if a ban on sodas larger than 16 ounces had gone through in New York? Using traditional surveys and hospital reports, it would have taken years to measure the impact. But if the Lexicocalorimeter were tuned finely enough to accurately measure the changes in soda habits by neighborhood, then public health officials could use it to target investments and adjust the campaign to reduce obesity far more effectively.
Playing around with the Lexicocalorimeter is illuminating and occasionally horrifying: a glimpse of the unvarnished American character. Click on a state, and it displays the 200 words that made the biggest difference in that state’s calorie counts. (#48 in caloric balance), everybody’s eating chocolate, cookies, shrimp, and cake. Everybody’s eating, period. It’s one of the only activities frequently mentioned. In California (#12), they dance, run, hike, and bike, but they rarely sit or lie down. In my home state of , the food on the tip of everybody’s thumbs is bacon, which is probably a big part of why we consume slightly more calories than the average state. (In our defense, we also spend an inordinate amount of time tweeting about beets, broccoli, and bananas.) Despite that, we are fairly exercise obsessed, with (you guessed it) skiing leading the way. All this gives us the third-best caloric ratio in the nation, behind Colorado and Wyoming. And sure enough, the health numbers match: We have some of the lowest rates of diabetes and obesity and one of the highest life expectancies.
In general, all states are more alike than we might like to believe. “Watching TV or movie” is the most-tweeted activity for every single state in the union, and “pizza” is the most-tweeted food for every state except Wyoming (cookies) and Mississippi (ice cream). Where a state’s individual character really shines is in the foods and activities mentioned far more or less than average. Texas (#36) can’t stop tweeting about doughnuts; Maine (#5) is hooked on lobster. In activities, the mountain states do a lot of running, the South is a solid block of eating, New Jersey is all about “getting my nails done,” and Delaware distinguishes itself with “talking on the phone.”
Dodds and Danforth acknowledge their methods are not perfect. The butter on the lobster doesn’t get counted. There’s no way of calculating if somebody ran one mile or ten. But when you’re talking tens of millions of tweets per day over the full range of demographics, the inaccuracies even out, at least as much as they do compared to the other, equally flawed ways of measuring society’s eating and exercising habits. As Dodds points out, the numbers speak for themselves: The Lexicocalorimeter correlates extremely well with rates of diabetes and obesity. “The ridiculous thing about this,” he says, “is that it works.”
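For readers who want to see the arithmetic, here is a minimal Python sketch of the calories-in, calories-out tally described above. It is not the lab's code. The calorie values are invented placeholders, the names `FOOD_CALORIES`, `ACTIVITY_CALORIES`, and `caloric_ratio` are hypothetical, and treating the ratio as calories out divided by calories in is an assumption.

```python
from collections import Counter

# Placeholder calorie values, invented for illustration only.
FOOD_CALORIES = {"cake": 350.0, "cookies": 150.0, "noodles": 200.0}
ACTIVITY_CALORIES = {"eating": 100.0, "running": 600.0, "skiing": 450.0}


def caloric_ratio(tweets):
    """Return calories out divided by calories in, or None if no food is mentioned."""
    # Count every whitespace-separated word across all tweets (no punctuation handling).
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    calories_in = sum(
        FOOD_CALORIES[w] * n for w, n in counts.items() if w in FOOD_CALORIES
    )
    calories_out = sum(
        ACTIVITY_CALORIES[w] * n for w, n in counts.items() if w in ACTIVITY_CALORIES
    )
    if calories_in == 0:
        return None  # Ratio is undefined without any food mentions.
    return calories_out / calories_in
```

Under this assumed definition, a higher ratio means a state tweets more about burning calories than about consuming them. Like the real tool, the sketch cannot tell whether somebody ran one mile or ten; it only counts mentions.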
We live in strange times. “People leave so much of their id on the web,” Danforth marveled to me, “and they share it openly. That’s enabled a whole host of new instruments to try to understand what’s predictable about our behavior. And it turns out a lot is. As much as we think we’re really complex, people have very structured ways of behaving. The way we move around the earth is very predictable. The way we use language is very predictable.”
For example, Barack Obama’s approval ratings over his presidency strongly correlate with the sentiment of tweets about him three months in advance of the approval polls. In other words, if you’d been a savvy politico with a tool for measuring tweets, you’d have had valuable intel months ahead of anyone else. “It’s an amazing time in social science because of the data available,” Dodds says. “It’s opened up a window that we absolutely did not have access to before.”

That’s the idea behind the UVM team’s Hedonometer, which surveys the country’s tweets each day and calculates a happiness score for each. The team had people rate 10,000 words on a happiness scale of one (sad) to nine (happy). Most words are neutral. The happiest are “laughter” (8.50), “happiness” (8.44), and “love” (8.42). “Hahaha” gets a 7.94, putting it a bit higher than “kisses” (7.74). The biggest negatives are “terrorist” (1.30), “suicide” (1.30), and “rape” (1.44). “Shit” gets a 2.50, “bitch” a 3.14, and “fuck” a surprisingly respectable 4.14. Fuck yeah! “Swearing is really important,” Dodds says.
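The scoring idea is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the team's implementation: it averages the ratings of whatever rated words appear in a piece of text, using only a handful of the scores quoted above. The names `WORD_SCORES` and `happiness_score` are hypothetical, and the real word list and any filtering the lab applies are not reproduced here.

```python
import re

# A few of the ratings quoted in the article; the full list has about 10,000 words.
WORD_SCORES = {
    "laughter": 8.50,
    "happiness": 8.44,
    "love": 8.42,
    "hahaha": 7.94,
    "kisses": 7.74,
    "terrorist": 1.30,
    "suicide": 1.30,
}


def happiness_score(text, scores=WORD_SCORES):
    """Return the mean rating of the rated words in text, or None if there are none."""
    words = re.findall(r"[a-z']+", text.lower())
    rated = [scores[w] for w in words if w in scores]
    if not rated:
        return None  # No rated words, so there is nothing to average.
    return sum(rated) / len(rated)
```

Fed a day's worth of tweets as one long string, a function like this would return a single number on the same one-to-nine scale as the word ratings.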
All this adds up to a chart tracking the nation’s mood from 2009 (the fledging of Twitter) to the present. “One of our goals was to provide a snapshot of the public’s response to something,” Danforth explained, “the texture of the day.” Most regular days fall into a narrow band with an average happiness level around six, though Saturdays are consistently the happiest days of the week and Tuesdays the grumpiest. There’s also a daily pattern, with happiness levels soaring around 5 and 6 a.m., when we’re all newly optimistic about the day, and then plunging throughout the morning and evening as reality sets in, reaching a trough of despair around 11 p.m. “The wheels kind of come off,” says Dodds. “We call it the daily unraveling of the human mind.”
You can also break the scores down by state. The happiest state is, unsurprisingly, Hawaii. The bottom dwellers are, once again, Mississippi and Louisiana, though Delaware gets a surprising bronze for melancholy. The West is happy, both coastal and mountains, while the South and Midwest are unhappy. Only Tennessee bucks the trend, an island of smiles in a sea of Southern gloom.
The happiest day of the year is always Christmas, when the Hedonometer spikes as words like “Christmas,” “happy,” “family,” and “love” flood the ether. The five unhappiest days since 2009: the shooting at Sandy Hook Elementary School, the Boston Marathon bombings, the Orlando nightclub attack, the shooting of Dallas police officers, and the election of Donald Trump.
National happiness is not consistent. We were quite happy from 2009 to 2011, despite the Great Recession. Then our mood darkened from 2011 to 2014, but we came out of it: The Hedonometer surged! The year 2015 was a relatively joyful one, and the good feelings kept going in 2016, until the election took over. Since then, signs have been growing that something terrible is happening to the American psyche. We’ve never been so erratic, with the normally smooth blips of the Hedonometer starting to twitch like someone failing a lie-detector test. And as of this writing, we’re sinking into an unprecedented malaise.
That is, if you believe the Hedonometer. On the face of it, measuring something as intangible as happiness sounds absurd. Yet, as with the Lexicocalorimeter, the Hedonometer matches “real world” measures such as the (which polls people on things like life satisfaction and personal health) and the (which surveys rates of homicides, violent crime, and incarceration).
History is full of concepts, from longitude to time, that seemed imprecise until the right instrument came along. Even temperature, which to us seems objective, was considered unmeasurable for centuries. “People thought you couldn’t do it,” Dodds says. “Because it’s too multifaceted, and the first thermometers were awful.” But eventually our instruments improved.
Dodds and Danforth see no reason why happiness can’t also be quantifiable. “We’re carrying around these phones that are sensing so much of our behavior,” says Danforth. “Tone of voice. Who we talk to. The types of words we use. We’re trying to push on a few areas and see what’s predictable, both on the population scale and for individuals.” And what they’re finding is that our phones have become surprisingly good instruments for taking our emotional temperatures. “Can we tell you’re about to experience an episode of depression based on your social media behavior? Maybe your friends can’t see it, maybe you don’t even realize it, but you’ve started to communicate with a smaller group socially, or you’re not moving around the earth as much.”
By analyzing the tweets of both depressed and healthy individuals, the Story Lab has developed algorithms that can accurately identify depression months before actual diagnoses by mental health practitioners. They’ve even done it with Instagram, discovering that depressed individuals are more likely to post photos that are bluer, grayer, and darker. Their method outperformed professional practitioners at identifying previously undiagnosed depression. The lab is now partnering with a psychiatrist at UVM who hopes to use the algorithm to search the social media history of ER visitors (who give their consent) to predict suicidal behavior.
One of the clearest signs that the Hedonometer is on to something is how well it works with media besides Twitter. The Story Lab has analyzed the words in 10,000 books and 1,000 movie scripts, and it accurately sorts the feel-goods from the nihilists. , powered by words like love, smiles, wedding, beautiful, and, yes, sex. At the bottom of the list we have grim fare like Commando, Day of the Dead, The Bourne Ultimatum, and Omega Man.
Yet for grimness, none of those can touch Outside’s masterpiece of masochism, “Bury My Pride at Wounded Knees,” by Mark Jenkins, about competing in the 2010 Death Race. I asked Reagan, Dodds, and Danforth to take the emotional temperature of 49 , which run the gamut from Steve Rinella’s joyful paean to Argentinian steak (“Me, Myself, and Ribeye”) to Jenkins’ mudfest, which begins “I unintentionally pitchfork a clod of manure into my mouth” and goes downhill from there.

Beyond a piece’s overall happiness score, the Hedonometer allows us to chart its emotional journey. Here’s “Into Thin Air,” perhaps the ultimate Outside classic, which starts off happy (Everest adventure!), plunges about a third of the way through the story as Krakauer reaches the perilous Lhotse Face (“It was here that we had our first encounter with death on the mountain”), soars at the halfway point (summit!), and then tanks far and fast as storms move in, mistakes are made, and people die.
Not only does the fall-rise-fall emotional arc of “Into Thin Air” nicely mirror the Himalayas, but it also happens to be a good example of a classic narrative arc. Riffing on the shapes of archetypal stories, the Story Labbers came up with six arcs that stories tend to follow: Rags-to-Riches (rise), Tragedy (fall), Man-in-a-Hole (fall-rise), Icarus (rise-fall), Cinderella (rise-fall-rise), and Oedipus (fall-rise-fall). Encouragingly, an analysis of the bestseller lists found that the more complex narratives (Cinderella and Oedipus) tend to sell better than the simpler ones.
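One way to picture how a story becomes an arc is to score it in overlapping chunks and then read off the rises and falls. The Python sketch below does exactly that, as a toy and nothing more. It is an assumption-laden illustration, not the Story Lab's method: the window size, the step, and the helper names `windowed_scores`, `name_arc`, and `ARC_NAMES` are all hypothetical, and a real analysis would need to smooth out small wiggles before naming a shape.

```python
# The six arcs named in the article, keyed by their sequence of rises and falls.
ARC_NAMES = {
    ("rise",): "Rags-to-Riches",
    ("fall",): "Tragedy",
    ("fall", "rise"): "Man-in-a-Hole",
    ("rise", "fall"): "Icarus",
    ("rise", "fall", "rise"): "Cinderella",
    ("fall", "rise", "fall"): "Oedipus",
}


def windowed_scores(words, scores, window, step):
    """Return the mean rating of rated words in each sliding window of words."""
    arc = []
    last_start = max(len(words) - window, 0)
    for start in range(0, last_start + 1, step):
        rated = [scores[w] for w in words[start:start + window] if w in scores]
        if rated:  # Skip windows that contain no rated words.
            arc.append(sum(rated) / len(rated))
    return arc


def name_arc(arc):
    """Collapse an arc into its rises and falls; return the matching name or None."""
    moves = []
    for prev, cur in zip(arc, arc[1:]):
        if cur == prev:
            continue  # Flat stretches do not change direction.
        move = "rise" if cur > prev else "fall"
        if not moves or moves[-1] != move:
            moves.append(move)
    return ARC_NAMES.get(tuple(moves))
```

A story whose windowed scores dip, climb, and dip again, as the one above does, would come back labeled Oedipus; anything with more than three changes of direction matches none of the six and returns `None`.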
This fall, the UVM team will be working on teasing even more stories out of the data we share: Can financial crashes be predicted ahead of time? When does fake news trump real news? How does a society settle on the story it tells about itself? The questions are far from trivial. “Humans are storytelling organisms,” Dodds says. It’s how we learn who we are. As the Story Lab gets even better at finding the signals in our noise, let’s hope we like what we discover.