Thesis on clustering of Twitter topics as a potential basis for election prediction
The thesis for my Bachelor's degree is downloadable –here–. It is written in English and deals mainly with clustering of Twitter topics.
Abstract - What we talk about when we talk about winners - Using clustering of Twitter topics as a basis for election prediction
Over the years, social media has become, in part, a platform for expressing opinions and discussing current events. Within the field of Computer Science, Twitter has been used both as the basis for political analysis, for example using sentiment analysis to predict election results, and within the field of cluster analysis, where the question of how best to design and use an algorithm to extract topics from tweets has been studied. The ClusTop algorithm is specifically designed to cluster tweets based on topics. This paper aims to explore whether it is possible to (a) use an implementation of the ClusTop algorithm to identify topics connected to tweets about Trump and Clinton just before the 2016 American election, and (b) distinguish between the topics used in connection with a specific candidate in states where they won versus states where they lost the election. The problem is approached through a controlled experiment in which the data collected from Twitter is divided into groups and run through the ClusTop algorithm. The topics are then compared to draw tentative conclusions about their validity as a basis for election prediction. The study finds that it is indeed possible to adapt the ClusTop algorithm for use with tweets and geolocation to identify different topics, thus confirming the usefulness of the algorithm. In addition, the study confirms that manually examining the words used within the topics makes it possible to see differences between them. The work thereby places itself in the tradition of exploring how Twitter can be used for election prediction by being one of the first studies to look at clustering as a way of approaching the problem.
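As a rough illustration of the grouping step described in the abstract, here is a minimal Python sketch. It is not the code from the thesis and it does not implement ClusTop; it only shows one way tweets might be divided by candidate and by whether that candidate won the state the tweet came from, before any clustering is run. The function name `group_tweets`, the `(state, text)` tweet format, the keyword matching, and the sample tweets are all assumptions made for the example. The four state results are the actual 2016 outcomes, included only as a small subset.

```python
from collections import defaultdict

# Winner per state in the 2016 election (small illustrative subset).
STATE_WINNER = {"CA": "clinton", "NY": "clinton", "TX": "trump", "FL": "trump"}
CANDIDATES = ("trump", "clinton")


def group_tweets(tweets):
    """Partition (state, text) pairs into (candidate, "won" | "lost") groups.

    Hypothetical helper: a tweet that mentions both candidates is placed
    in one group for each of them.
    """
    groups = defaultdict(list)
    for state, text in tweets:
        winner = STATE_WINNER.get(state)
        if winner is None:
            continue  # skip tweets whose state is unknown
        lowered = text.lower()
        for candidate in CANDIDATES:
            if candidate in lowered:
                outcome = "won" if candidate == winner else "lost"
                groups[(candidate, outcome)].append(text)
    return dict(groups)


# Made-up sample data, for illustration only.
sample_tweets = [
    ("TX", "Trump rally tonight"),
    ("CA", "Trump and Clinton debate recap"),
    ("NY", "Clinton on healthcare"),
    ("ZZ", "Clinton speech"),
]
groups = group_tweets(sample_tweets)
```

Each resulting group could then be passed separately to a topic clustering step, and the topics from the "won" and "lost" groups of the same candidate compared by hand, as the abstract describes.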