When it comes to social media, I was not an early adopter. I watched the grad students at the University of Toronto bang away on their Facebook pages for several years before I tentatively set up my own account. To this day, I am still not a particularly active Facebook user. When I learned about Twitter, it seemed even more odd. The idea that people were sending their thoughts out into the aether in 140-character bursts, and total strangers were reading them, was baffling to me.
Over the last few years, my opinion of Twitter has changed. Twitter is awesome. Because people are sending their thoughts out into the aether in 140-character bursts, and we can read them. Or search them, compile them, count them, and make graphs.
When enough people are on Twitter and are engaged in the same live event, it is possible to collect all of this information and try to make sense of it. This season, I am going to do this with Texas football games. The Longhorn fan base is huge, and is very active on Twitter. With some programming, I am able to sample approximately 200 game-related tweets per minute during Texas football games. Over a three-and-a-half-hour game, this gives me roughly 40,000 tweets to analyze. These 40,000 tweets reflect the sentiments of the portion of the Texas fan base that is active on Twitter, and show how those sentiments change as the game progresses.
Follow along after the jump to see what your fellow Longhorn fans were tweeting about during the Texas-Wyoming game.
My basic approach was to sample tweets that reference the Texas Longhorns in some way during a football game. After doing that, I ran keyword searches on the database of just under 70,000 tweets I collected between 7 PM and midnight on September 1. All times are listed in Central Time. I counted how many times each keyword occurred in each five-minute interval during the course of the game; this forms the basic data set I used to create the plots below.
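For readers who want to try the counting step themselves, here is a minimal sketch of how it could be done. This is not the exact code I ran. It assumes the tweets are already loaded as (timestamp, text) pairs, and the function names, the whole-word matching rule, and the sample rows are all invented for illustration.

```python
import re
from collections import Counter
from datetime import datetime


def bin_start(ts, minutes=5):
    """Round a timestamp down to the start of its five-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def count_keyword(tweets, keywords, minutes=5):
    """Count tweets per interval that mention any of the keywords.

    tweets: iterable of (datetime, text) pairs (an assumed layout).
    Matching is case-insensitive and on whole words, so "Ash" does not
    match "crash"; that rule is an assumption, not taken from the post.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for ts, text in tweets:
        if pattern.search(text):
            counts[bin_start(ts, minutes)] += 1
    return counts


# Invented sample rows, not real tweets from the game.
sample = [
    (datetime(2012, 9, 1, 19, 2, 10), "David Ash looks sharp"),
    (datetime(2012, 9, 1, 19, 4, 59), "Ash to Shipley!"),
    (datetime(2012, 9, 1, 19, 7, 0), "Hook 'em"),
    (datetime(2012, 9, 1, 19, 8, 30), "ASH with another completion"),
]

ash_counts = count_keyword(sample, ["Ash"])
for start in sorted(ash_counts):
    print(start.strftime("%H:%M"), ash_counts[start])
```

Each key in the result is the start of a five-minute interval, so the counts can be plotted directly against game time.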
To capture the general feelings of the Longhorn fans who are tweeting during the game, I have created something called the "Negativity Index." I calculate this index by counting the number of tweets in each five-minute interval that contain at least one word from a list of words that are almost always used in a negative way. Most of these words are obscenities. I then rescaled the data so that the negativity index would fit on the same set of axes as the other data that I present below. The plots below therefore do not show an absolute count of negative tweets, but are meant to show how fan negativity changes over the course of the game.
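A rough sketch of the negativity index calculation follows. The word list here is a tame placeholder, since the real list is mostly obscenities. The rescaling rule is also an assumption: this version simply stretches the counts so that the worst interval equals a chosen peak value, which is one simple way to fit the index on the same axes as the other series.

```python
import re
from collections import Counter
from datetime import datetime

# Placeholder list for illustration; the actual word list is not shown here.
NEGATIVE_WORDS = ["awful", "terrible", "ugh"]


def negativity_index(tweets, negative_words, peak=100.0, minutes=5):
    """Return {interval start: index}, scaled so the worst interval == peak.

    tweets: iterable of (datetime, text) pairs (an assumed layout).
    The peak-based rescaling is an assumed rule chosen for this sketch.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in negative_words) + r")\b",
        re.IGNORECASE,
    )
    raw = Counter()
    for ts, text in tweets:
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        # Adding 0 still records the interval, so quiet stretches show as 0.
        raw[start] += 1 if pattern.search(text) else 0
    largest = max(raw.values(), default=0)
    if largest == 0:
        return {start: 0.0 for start in raw}
    return {start: peak * n / largest for start, n in raw.items()}


# Invented sample rows, not real tweets from the game.
sample = [
    (datetime(2012, 9, 1, 19, 1, 0), "ugh, terrible coverage"),
    (datetime(2012, 9, 1, 19, 3, 0), "Awful tackling"),
    (datetime(2012, 9, 1, 19, 6, 0), "ugh"),
    (datetime(2012, 9, 1, 19, 12, 0), "Touchdown Texas!"),
]

index = negativity_index(sample, NEGATIVE_WORDS)
for start in sorted(index):
    print(start.strftime("%H:%M"), index[start])
```

Because the index is rescaled, only its shape over time is meaningful; the numbers themselves are not tweet counts.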
The first plot below shows the negativity index as a function of time, along with the number of tweets that mention either David Ash or Case McCoy. Additionally, I have labeled a handful of key events throughout the game. The two largest spikes in the negativity index occurred when Wyoming scored touchdowns. The touchdown that Wyoming scored to start the fourth quarter produced the largest negative response. The 82-yard Wyoming touchdown pass in the first quarter also generated a large spike in negativity.
David Ash's touchdown pass to Jaxon Shipley in the first half of the game actually produced a five-minute period with a negativity index of zero. This event also resulted in a large number of tweets mentioning David Ash. The second-highest rate of tweets about David Ash occurred when he fumbled the ball in the fourth quarter.
The blue squares in the graph above represent the tweets that mentioned Case McCoy. I was sort of expecting tweets mentioning McCoy to spike after Ash's fumble, but the Texas fans on Twitter are not as fickle as I anticipated. There were a few tweets about McCoy after Ash's fumble, but not very many. McCoy was hardly mentioned on Twitter until he entered the game near the end of the fourth quarter. McCoy entering the game also corresponded to a period of low negativity, as it was likely taken as a sign that the game was over.
The plot below shows the number of tweets about Malcolm Brown and Joe Bergeron during the game. As you would expect, there was a pretty large spike in tweets when the two running backs found the end zone. Additionally, Bergeron's 54-yard run brought a loud Twitter cheer. After Bergeron scored his fourth-quarter touchdown, the negativity index briefly dropped to zero.
How you can help me
I am still getting the hang of what to look for in these data. I need help from you, our community of readers. What would be interesting to look for using the data taken from Twitter during Longhorn football games? Please use the comments section below to help me. Help me make this project as cool as possible. I cannot promise that I will be able to implement all of your suggestions, but I will read all of them and try to implement the ones that I can.