Predicting US 2016 Presidential Election

Project Description

In this post, we describe how we collect data at very large scale from news websites, blogs and Twitter to analyze the trends of the US 2016 presidential election.

Trump vs Clinton US 2016 Election
Fig 1. - US 2016 Election (source: ichef.bbci.co.uk)

Candidates' Overall Social Presence

Let's first compare Trump's Twitter presence with Clinton's. Judging by each candidate's number of followers and posted tweets, Trump's account leads Clinton's by a significant margin. The table below shows each candidate's stats as recorded on June 20th, 2016:

Candidate | Followers | Tweets Posted
Donald Trump | 9.17M | 32.3K
Hillary Clinton | 7.01M | 6080

Twitter Analysis - June 19th 2016

On June 19th, we sampled more than 1.0M political tweets to calculate the share of conversations around each candidate. See the table below for the stats:

Candidate | Mentions (%)
Donald Trump | 68%
Hillary Clinton | 20%

This basic analysis of Twitter shows that Sanders is losing momentum on social media, as he was mentioned in only 5.5% of tweets. There is pressure on Sanders from the Democratic Party to drop out.
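The post does not say how a tweet was matched to a candidate. As a minimal sketch, assuming simple whole-word keyword matching, the mention percentages could be computed as follows. The keyword lists and the `mention_shares` function are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical keyword lists (assumption): the post does not say
# which terms were used to match a tweet to a candidate.
CANDIDATE_KEYWORDS = {
    "Donald Trump": ["trump", "realdonaldtrump"],
    "Hillary Clinton": ["clinton", "hillary"],
    "Bernie Sanders": ["sanders", "bernie"],
}

def mention_shares(tweets):
    """Return the percentage of tweets that mention each candidate."""
    counts = Counter()
    total = 0
    for text in tweets:
        total += 1
        # Whole-word matching on lowercased tokens.
        words = set(re.findall(r"[a-z0-9_]+", text.lower()))
        for candidate, keywords in CANDIDATE_KEYWORDS.items():
            if any(keyword in words for keyword in keywords):
                counts[candidate] += 1
    if total == 0:
        return {candidate: 0.0 for candidate in CANDIDATE_KEYWORDS}
    return {c: 100.0 * counts[c] / total for c in CANDIDATE_KEYWORDS}
```

Under this scheme the percentages need not add up to 100%, because a tweet can mention several candidates or none of them.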

We also ran a basic sentiment analysis on the tweets collected on June 19th to gauge the popularity of each candidate on Twitter. The table below shows each candidate's percentages of positive and negative tweets. Based on our analysis, Trump's positive-to-negative ratio (44/30 ≈ 1.47) is higher than Clinton's (51/38 ≈ 1.34).

Candidate | Positive (%) | Negative (%)
Donald Trump | 44% | 30%
Hillary Clinton | 51% | 38%
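The sentiment method behind these numbers is not described in the post. One common basic approach is lexicon-based scoring, sketched below under that assumption. The word lists and the `classify` and `sentiment_percentages` functions are illustrative placeholders, not the ones used for the table above.

```python
import re

# Tiny illustrative lexicons (assumption, not the lists used in the post).
POSITIVE_WORDS = {"great", "good", "win", "love", "best", "support"}
NEGATIVE_WORDS = {"bad", "liar", "lose", "hate", "worst", "crooked"}

def classify(text):
    """Label a tweet 'pos', 'neg' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    if positive_hits > negative_hits:
        return "pos"
    if negative_hits > positive_hits:
        return "neg"
    return "neutral"

def sentiment_percentages(tweets):
    """Return (positive %, negative %) over a list of tweet texts."""
    labels = [classify(text) for text in tweets]
    if not labels:
        return 0.0, 0.0
    positive = 100.0 * labels.count("pos") / len(labels)
    negative = 100.0 * labels.count("neg") / len(labels)
    return positive, negative
```

Because neutral tweets are counted in the denominator, the positive and negative percentages do not have to sum to 100%, which matches the shape of the table above.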

Twitter Analysis - July 9th 2016

On July 9th, we sampled more than 1.6M political tweets. We sampled 1% of tweets for our analysis. First, we calculated the percentage of conversations for each candidate:

Candidate | Mentions (%)
Donald Trump | 56%
Hillary Clinton | 31.5%
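How the sample was drawn is not specified. A simple option, shown here only as a sketch, is Bernoulli sampling, where each tweet is kept independently with a fixed probability. The `sample_tweets` function, its `rate` parameter and its `seed` parameter are assumptions made for illustration.

```python
import random

def sample_tweets(tweets, rate=0.01, seed=None):
    """Keep each tweet independently with probability `rate`."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return [tweet for tweet in tweets if rng.random() < rate]
```

This works on any iterable, including a stream read line by line, and the sample size is only approximately `rate` times the input size.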

Next, we ran the sentiment analysis on 160k tweets from July 9th to measure the popularity of each candidate. The table below shows that Trump was more popular than Clinton on Twitter: positive tweets about him outnumber negative ones (48% vs. 33%), while for Clinton the reverse holds (42% vs. 48%).

Candidate | Positive (%) | Negative (%)
Donald Trump | 48% | 33%
Hillary Clinton | 42% | 48%
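To make the comparison between the two dates concrete, the short script below computes the positive-to-negative ratio from the percentages in the June 19th and July 9th sentiment tables. The `pos_neg_ratio` helper is a name introduced here for illustration.

```python
# Positive and negative percentages copied from the two sentiment tables.
SENTIMENT = {
    "June 19th": {"Donald Trump": (44, 30), "Hillary Clinton": (51, 38)},
    "July 9th": {"Donald Trump": (48, 33), "Hillary Clinton": (42, 48)},
}

def pos_neg_ratio(positive, negative):
    """Ratio of positive to negative tweet percentages."""
    return positive / negative

for date, rows in SENTIMENT.items():
    for candidate, (positive, negative) in rows.items():
        ratio = pos_neg_ratio(positive, negative)
        print(f"{date}  {candidate}: {ratio:.2f}")
```

Trump's ratio stays well above 1 on both dates, while Clinton's drops below 1 on July 9th.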

Last week, on Tuesday, an officer shot and killed Alton Sterling in Louisiana while he was being held down. Then on Wednesday, Facebook Live captured the death of Philando Castile in Minnesota, who was shot by police during a traffic stop. Finally, during a peaceful protest in Dallas on Thursday over those killings, a sniper killed multiple police officers and injured others.

This series of events might explain what is happening on social media. See the following links for each candidate's reaction to the Dallas incident: Trump: Dallas shootings have 'shaken the soul of our nation' and Clinton: Dallas shootings an 'absolutely horrific event'.

Next, we measured the most discussed topics on Twitter on July 9th by computing the percentages of the top hashtags:

Top Hashtags | Percentage (%)
#Trump2016 | 8.46%
#Trump | 5%
#Trumptrain | 2.84%
#maga | 2.53%
#Dallas | 1.96%
#hillaryclinton | 1.8%
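A hashtag ranking like the one above can be sketched with a simple counter. The post does not say whether the percentages are relative to the number of tweets or to the number of hashtag occurrences. The sketch below assumes the former and counts each hashtag at most once per tweet. The `top_hashtags` function is a hypothetical name.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")

def top_hashtags(tweets, n=5):
    """Return the n most common hashtags with the % of tweets containing each."""
    counts = Counter()
    total = 0
    for text in tweets:
        total += 1
        # Count each hashtag once per tweet, ignoring case.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(text)})
    if total == 0:
        return []
    return [("#" + tag, 100.0 * count / total)
            for tag, count in counts.most_common(n)]
```

Lowercasing merges variants such as #MAGA and #maga into one entry. Related tags such as #Trump and #Trump2016 remain separate, as they do in the table.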

Readers should note that this is an ongoing analysis.

Predicting US 2012 Election

In 2012, we analyzed the US presidential election and successfully predicted that Obama would defeat Romney. You can find the details of that analysis here: Predicting US 2012 Election using Twitter Data and also here: Barack Obama or Mitt Romney: that's the question.