Increasing predictive power by using Twitter as a complementary source

Master's thesis data

Specialization: Algorithmics and Programming
Thesis advisors: Marta Arias Vicente/Argimiro Arratia
Orientation: Research
Student: Ramon Xuriguera Albareda
Thesis Description
Typically, machine learning algorithms take data as input and yield a model that fits the input well
and, ideally, has good predictive power. The information that these algorithms are allowed to
use is restricted to the input data presented to them. There is no reason, though, why one
should restrict the input to this data alone; in today’s world there exist rich online repositories that
could provide useful external information and help build more accurate and predictive models.
The main goal of the proposed thesis is to test the viability of this idea in two different contexts:
- Stock Market prediction
- Box office / TV ratings prediction
We plan to study and design a module that can automatically extract relevant information from
Twitter so that it can later be used as additional attributes to increase the predictive power of
current machine learning algorithms.
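
As a rough illustration of how Twitter-derived information might become a new attribute, the
following minimal Python sketch counts daily keyword mentions and appends the count to a set of
baseline attributes. The function names, the keyword-matching rule, the attribute names, and the
toy data are all hypothetical; they are assumptions made for illustration, not the design of the
proposed module.

```python
from collections import Counter
from datetime import date


def daily_mention_counts(tweets, keyword):
    """Count, per day, the tweets whose text mentions `keyword` (case-insensitive)."""
    counts = Counter()
    for day, text in tweets:
        if keyword.lower() in text.lower():
            counts[day] += 1
    return counts


def add_twitter_attribute(rows, counts):
    """Return copies of the baseline rows with one extra attribute appended.

    Days with no matching tweets get a count of 0.
    """
    return [dict(row, tweet_mentions=counts.get(row["day"], 0)) for row in rows]


# Hypothetical toy data, for illustration only.
tweets = [
    (date(2012, 1, 2), "ACME shares look strong today"),
    (date(2012, 1, 2), "Lunch was great"),
    (date(2012, 1, 3), "Selling my acme stock"),
]
baseline_rows = [
    {"day": date(2012, 1, 2), "previous_close": 10.0},
    {"day": date(2012, 1, 3), "previous_close": 10.4},
    {"day": date(2012, 1, 4), "previous_close": 10.1},
]

augmented_rows = add_twitter_attribute(
    baseline_rows, daily_mention_counts(tweets, "acme")
)
```

The same learning algorithm could then be trained once on baseline_rows and once on
augmented_rows, so that any difference in accuracy can be attributed to the added attribute.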
Our evaluation approach will consist of comparing the accuracy of state-of-the-art prediction
techniques with the accuracy achieved by providing external data from Twitter to the same