The 2016 Presidential Elections will stand out in history as one of the most highly contested, racially charged, controversial and often bizarre races. Keeping up with a fast-paced, competitive news cycle as part of the data team for Univision's Fusion Media Group was no small challenge.
More than 22 candidates entered the fray. As the race narrowed, Bernie Sanders and Hillary Clinton took the lead on the Democratic side; Donald Trump rose through the Republican ranks. Media followed the race with an unprecedented intensity, relying more than ever on social media to drive traffic.
Here are some highlights of the data-driven stories we developed.
GOAL
Create timely, newsworthy, data-driven stories focused on the 2016 Presidential Elections
PROBLEM
Fusion Media Group had a three-person data team, made up of Daniel McLaughlin, Ross Goodwin and Sam Levine, and was looking to add a team member with journalism experience who could help define, package and write data stories. As the team evolved, Kate Stohr's role shifted to include more Python coding, data mining, partnership development and overall project management.
Responsible for:
Data mining
Data analysis
Data visualization
Bot development
Data scraping/API
Story production
Deadline management
TOOLS
Python, AWS, MongoDB
How 38 counties that represent a truly diverse America voted
Given Fusion's audience, we decided to focus our election night reporting on how places that represented the true diversity of America voted. Although the counties were selected and data visualizations were prepped in advance, the bulk of the work on this piece happened on the night as we processed the AP's real-time results to give our readers a view of how majority-minority counties voted compared with other places.
How much does your vote count?
Most people understand that the electoral college has a distorting effect on the power of an individual's vote. But fewer understand just how much and why. So we built a tool that showed people the real power of their vote. It showed how the shifting landscape of the race, along with voter turnout rates, gave individual voters more or less influence over the outcome of the election. The tool was built by Fusion team member and data guru Daniel McLaughlin. Kate Stohr's role was to help explain the complex algorithm behind the tool in a way that readers could easily understand.
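To make the turnout effect concrete, here is a minimal Python sketch. It is not the algorithm behind the Fusion tool, which also accounted for the shifting state of the race. It only compares electoral votes per ballot cast against the overall average. The state names and figures are hypothetical placeholders, not real results.

```python
def relative_vote_weight(states):
    """Return each state's electoral votes per ballot, relative to the overall average.

    A value above 1.0 means a ballot there carries more electoral weight
    than the average ballot; below 1.0 means less.
    """
    total_votes = sum(s["electoral_votes"] for s in states.values())
    total_ballots = sum(s["ballots_cast"] for s in states.values())
    average = total_votes / total_ballots
    return {
        name: (s["electoral_votes"] / s["ballots_cast"]) / average
        for name, s in states.items()
    }


# Hypothetical placeholder figures, for illustration only.
example_states = {
    "Small State": {"electoral_votes": 3, "ballots_cast": 250_000},
    "Large State": {"electoral_votes": 30, "ballots_cast": 7_500_000},
}

weights = relative_vote_weight(example_states)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}x the average ballot")
```

In this toy example a ballot in the small state carries three times the electoral weight of a ballot in the large one. Lower turnout in a state raises the weight of each ballot cast there.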
How Trump’s troll army is cashing in on his campaign
We caught the 'fake news' trend early in the campaign. Calling upon 99 Antennas data researchers, we were able to scrape and label thousands of Facebook pages about the candidates as either pro or con. (We did this manually because political irony is something sentiment analysis doesn't handle well.) One theme stood out: Many of the most popular Trump-related pages either linked to clickbait (better known now as "fake news") or sites selling merchandise.
This bubble graph shows Facebook Graph API search results as of April 9, 2016, with a breakout of the popularity of pages in support of Trump, against Trump and those simply selling goods. Pages owned by the candidate himself or unrelated to the candidate’s presidential campaign were removed from the dataset.
View Article (PDF)
View data visualization
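Once pages have been labeled by hand, the breakout behind a chart like this comes down to a simple aggregation. The Python sketch below assumes each page has been reduced to a record with a manual `label` and a `likes` count. The field names, labels and numbers are hypothetical placeholders, not the original dataset.

```python
from collections import defaultdict


def summarize_by_label(pages):
    """Total the number of pages and likes for each manually assigned label."""
    totals = defaultdict(lambda: {"pages": 0, "likes": 0})
    for page in pages:
        bucket = totals[page["label"]]
        bucket["pages"] += 1
        bucket["likes"] += page["likes"]
    return dict(totals)


# Hypothetical placeholder rows, for illustration only.
labeled_pages = [
    {"name": "Page 1", "label": "pro", "likes": 1200},
    {"name": "Page 2", "label": "pro", "likes": 300},
    {"name": "Page 3", "label": "con", "likes": 450},
    {"name": "Page 4", "label": "merchandise", "likes": 900},
]

summary = summarize_by_label(labeled_pages)
for label, stats in summary.items():
    print(f"{label}: {stats['pages']} pages, {stats['likes']} likes")
```

Each label's total could then set the size of one bubble.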
Before you vote for a third-party candidate, here’s what you need to know
As the campaign headed into the fall homestretch, we spotted a rise in interest in the third-party candidates' social media. Polling numbers for Libertarian Gary Johnson and Green Party candidate Jill Stein had risen sharply. Using historic Gallup data, we looked to see how their polling numbers compared to those of past third-party candidates at the same period in the campaign cycle. Since 1988, third-party candidates have averaged less than one tenth of a percent of the popular vote. As it turned out, both Stein and Johnson were polling above average, but as with past third-party campaigns, interest waned as election day neared. Although both got more votes than they had in 2012, neither gained the 5% needed to qualify their parties for federal funding in the next election.
Clinton’s post-convention surge is still not enough to make her as ‘like-able’ as Trump on Facebook
After the campaign winnowed, there was even more interest in the candidates' social media. To track it and catch shifts (or highlight the status quo) in the social media landscape, Kate Stohr built a Slack bot that culled traffic to the candidates' social media accounts and automatically generated a number of data reports and plots. The bot saved time and allowed her to spot the rise of third-party candidates and other trending news. Used internally, the tool was an example of how data products can supplement ongoing reporting.
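The original bot's code is not reproduced here. As a rough Python sketch of the reporting step, the example below compares two snapshots of follower counts and builds the JSON body that a Slack incoming webhook accepts (an object with a `text` field). The account names, counts and function names are hypothetical, and the example stops short of posting anything.

```python
import json


def follower_changes(previous, current):
    """Return the percent change in followers for accounts present in both snapshots."""
    changes = {}
    for account, now in current.items():
        before = previous.get(account)
        if before:  # skip accounts with no (or zero) baseline
            changes[account] = (now - before) / before * 100
    return changes


def build_slack_payload(changes):
    """Format the changes, largest first, as a JSON body with a `text` field."""
    lines = ["Follower change since last report:"]
    ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)
    for account, pct in ranked:
        lines.append(f"- {account}: {pct:+.1f}%")
    return json.dumps({"text": "\n".join(lines)})


# Hypothetical placeholder snapshots, for illustration only.
previous = {"account_a": 1000, "account_b": 2000}
current = {"account_a": 1100, "account_b": 2010, "account_c": 50}

payload = build_slack_payload(follower_changes(previous, current))
print(json.loads(payload)["text"])
```

Sorting by change puts the fastest-growing accounts at the top of the report, where a sudden rise is easiest to spot.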
Here are all the Congresspeople who took NRA money and tweeted prayers for Orlando
In the wake of the Orlando night club shooting, politicians around the country tweeted their condolences to friends and families of the victims. Gun control activists pointed out that many of these same tweets came from members of Congress who accepted funds from the National Rifle Association. Using data from opensecrets.org and Carto's mapping API, we sought to bring this painful contradiction to light.
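At its core this is a join between two datasets: contribution records and tweets. The Python sketch below shows one way to do it, assuming both have already been reduced to simple records keyed by member name. The member names, amounts, field names and phrase list are hypothetical placeholders and do not reflect the opensecrets.org schema or the original analysis.

```python
def members_to_highlight(contributions, tweets, phrases=("prayers", "thoughts")):
    """Return members who both received funds and tweeted one of the phrases."""
    funded = {
        row["member"]: row["amount"]
        for row in contributions
        if row["amount"] > 0
    }
    matches = {}
    for tweet in tweets:
        member = tweet["member"]
        text = tweet["text"].lower()
        if member in funded and any(phrase in text for phrase in phrases):
            # Keep only the first matching tweet per member.
            matches.setdefault(
                member, {"amount": funded[member], "tweet": tweet["text"]}
            )
    return matches


# Hypothetical placeholder records, for illustration only.
contributions = [
    {"member": "Member A", "amount": 5000},
    {"member": "Member B", "amount": 0},
    {"member": "Member C", "amount": 1000},
]
tweets = [
    {"member": "Member A", "text": "My thoughts and prayers are with Orlando."},
    {"member": "Member B", "text": "Prayers for the victims and their families."},
    {"member": "Member C", "text": "Town hall scheduled for next week."},
]

highlighted = members_to_highlight(contributions, tweets)
print(highlighted)
```

Keyword matching like this is crude, so in practice the matched tweets would still need a manual read before publication.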
Warning: This election contains language some people may find offensive
Uncivil discourse in politics is nothing new, but we can now track it in real time. For this piece we partnered with MIT to access data from a new tool they developed called the Electome, which parsed Twitter's full firehose to identify election-related tweets. Combined with the NLP tools they applied to the dataset, this allowed filtering and analysis of the political conversation on Twitter, and it wasn't necessarily polite. Kate Stohr's reporting looked at the types and topics of uncivil discourse as well as some of the reasons behind it.
Why Bernie Sanders loves Jimmy Kimmel and Hillary Clinton loves Ellen
This piece compared data from the Political TV Ad Archive with demographic data from Nielsen. We were surprised to learn that the candidates were, for the most part, airing ads during the same popular shows. Then Kate Stohr did some reporting and learned about spot ads, which allow you to buy ads in nationally syndicated shows but pick and choose the local markets where they air. It is a strategy that makes sense no matter whom the candidate is vying to persuade.
Top 5 negative ads: What super PACs want you to know about Donald Trump
Before Super Tuesday, Marco Rubio was the prime target for negative ads. When Bush dropped out, taking his well-funded super PAC with him, the landscape shifted. The super PACs took aim at Trump. By analyzing data from the Political TV Ad Archive, we were able to see the shift before other news outlets. Opposing candidates’ own research showed Trump in the lead...
Donald Trump loves ‘poorly educated’ voters. Just who is he talking about?
This early piece helped elucidate why Donald Trump was excited about 'poorly educated' voters. (Hint: They represent anywhere from 1 in 10 to 6 in 10 Americans, depending on how you define 'poorly educated.')
Responding to the news cycle, we turned the piece around in a few hours. It was one of the few data explainers we did using available research. Most of our work focused on original datasets created by us or by partners.
When presidential candidates tag each other on Twitter, things get awkward
In this piece, we analyzed candidate tweets that used the @[username] convention to mention other candidates since the campaign began. We wanted to see how often the candidates tagged each other on Twitter, whom they tagged, and why.
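Counting who tagged whom can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the story. The handles, tweet records and the `count_candidate_mentions` helper are hypothetical, and the regular expression assumes handles of up to 15 letters, digits or underscores.

```python
import re
from collections import Counter

# Hypothetical candidate handles; a real list would cover every candidate tracked.
CANDIDATE_HANDLES = {"candidate_a", "candidate_b", "candidate_c"}

# Match @username when it is not part of a longer word (for example, an email address).
MENTION_RE = re.compile(r"(?<!\w)@(\w{1,15})(?!\w)")


def count_candidate_mentions(tweets):
    """Count (author, mentioned) pairs where both accounts belong to candidates."""
    pairs = Counter()
    for tweet in tweets:
        author = tweet["author"].lower()
        for handle in MENTION_RE.findall(tweet["text"]):
            handle = handle.lower()
            if handle in CANDIDATE_HANDLES and handle != author:
                pairs[(author, handle)] += 1
    return pairs


# Hypothetical placeholder tweets, for illustration only.
sample_tweets = [
    {"author": "candidate_a", "text": "Looking forward to debating @candidate_b tonight."},
    {"author": "candidate_a", "text": "@Candidate_B still has not answered the question."},
    {"author": "candidate_b", "text": "Thanks to everyone who came out today!"},
    {"author": "candidate_c", "text": "I agree with @candidate_a and @someone_else on this."},
]

mention_counts = count_candidate_mentions(sample_tweets)
for (author, mentioned), count in mention_counts.most_common():
    print(f"{author} tagged {mentioned} {count} time(s)")
```

The counts answer "how often" and "whom". The "why" still requires reading the tweets themselves.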
At the start of the project, Fusion's data team faced a number of challenges, including a lack of access to the site's production servers and to visualization tools. Stohr identified a number of workarounds, from creating infographic representations such as this one to using third-party service providers like Carto and Plot.ly.
MILESTONES
Start Date: January 2016
End Date: November 2016
Status: Complete
CLIENT
Fusion Media Group (Univision)
Data team: Kate Stohr, Daniel McLaughlin, Ross Goodwin, Sam Levine
Editors: Erin McClam, Kashmir Hill, Alexis Madrigal
Interactive: Rachel Schallom
Innovation: Sam Ford
Data Researchers:
Taniesha Broadfoot (99 Antennas)
Rachel Connolly Kwock (99 Antennas)
Grace Walker (99 Antennas)
Monica White (99 Antennas)
Giving a talk later this week on Python and covering the 2016 Elections at the PyLadies weekly meetup. #py-lovers. Should be fun. Here are the details:
Covering the 2016 Presidential Elections
Thursday, April 13, 2017
6:00 PM to 8:00 PM
(Location available upon registration)
The most popular Facebook posts, Tweets, and emoji from the presidential campaign this year
Even before the campaign began in earnest, the candidates (more than 22 major candidates in all) were campaigning on social media. Before Kate Stohr joined the team, Fusion set up a database collecting the candidates' social media feeds. By the end of 2015, it had racked up 152,883 posts made by the candidates on Twitter, Facebook and Instagram. This first post was a 'retrospective' on the campaign so far, and it was only just beginning.