Hyewon Kwon
Class: Text As Data
Dec 14, 2022
News Headlines Data
Summary
Given the growing negative perceptions of former President Donald Trump in 2017, the project expected these negative views to appear more often in politics-category headlines after the 2016 election than before. Using 1,000 training samples and 400 test samples of news headlines to score word sentiment, the results confirm that more negative words appear in politics-category headlines after the 2016 election, although the difference is small.
News Category Dataset
The dataset, provided by HuffPost, contains over 210k news headlines from 2012 to 2022. Because of the large volume of the raw dataset, the sample covers only headlines from November 8, 2015, to November 8, 2017: one year before the 2016 election and one year after it.
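The date window described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the record layout (a dict with an ISO-formatted `date` string) and the function name `label_period` are assumptions, as is the choice to count election day itself as "after".

```python
from datetime import date

# Window boundaries taken from the dataset description.
WINDOW_START = date(2015, 11, 8)
ELECTION_DAY = date(2016, 11, 8)
WINDOW_END = date(2017, 11, 8)


def label_period(record):
    """Return 'before', 'after', or None if the headline is outside the window.

    Assumption: `record` is a dict whose 'date' value is an ISO string
    such as '2016-03-15' (hypothetical field name and format).
    """
    d = date.fromisoformat(record["date"])
    if not (WINDOW_START <= d <= WINDOW_END):
        return None
    # Assumption: headlines published on election day count as 'after'.
    return "before" if d < ELECTION_DAY else "after"


# Hypothetical records, for illustration only.
sample = [
    {"headline": "Example A", "date": "2016-01-01"},
    {"headline": "Example B", "date": "2017-01-01"},
    {"headline": "Example C", "date": "2014-01-01"},
]
labels = [label_period(r) for r in sample]
```

Headlines labeled `None` would simply be dropped before sampling.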
Results
The sentiment score is calculated by subtracting the number of negative words from the number of positive words and dividing the result by the total number of words. The outcome assigns words a score between -5 and 5.
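As a rough sketch of the calculation described above, the following Python function computes (positive words minus negative words) divided by total words for one headline. The word lists, the tokenizer, and the name `sentiment_score` are placeholders chosen for illustration; the project's actual lexicon is not specified here.

```python
import re

# Tiny illustrative word lists (assumption); a real analysis would
# load a full sentiment lexicon instead.
POSITIVE = {"good", "great", "win", "love"}
NEGATIVE = {"bad", "crisis", "attack", "fear"}


def sentiment_score(headline):
    """Return (positive count - negative count) / total word count."""
    # Simple lowercase tokenizer (assumption): runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", headline.lower())
    if not words:
        return 0.0  # avoid dividing by zero on an empty headline
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


# One positive word and two negative words out of six words in total.
score = sentiment_score("Good news after a bad crisis")
```

A ratio computed this way always falls between -1 and 1, with negative values indicating more negative than positive words.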
The results show that negative words made up 0.233 percent and positive words 0.309 percent of the words in headlines before the election. After the election, negative words made up 0.244 percent and positive words 0.299 percent. The use of negative words increased while the use of positive words decreased, even though the differences were small.
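The size of the shift can be checked with simple arithmetic on the four reported figures. The snippet below is only a worked example; the variable names are illustrative, and the values are used exactly as reported.

```python
# Reported shares of negative and positive words, as given in the results.
before = {"negative": 0.233, "positive": 0.309}
after = {"negative": 0.244, "positive": 0.299}

# Change from before to after the election, rounded to three decimals
# to avoid floating-point noise.
change = {k: round(after[k] - before[k], 3) for k in before}
```

Here `change["negative"]` is positive and `change["positive"]` is negative, matching the direction described in the text.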