On this day, Hillevi Hägglöf defended her MA thesis in language technology. It compared methods for finding timely and topically salient terms in political discourse from social media, newsprint, and public parliamentary records. She compared Latent Dirichlet Allocation-based topic modelling with a simpler probabilistic method based on chi-square analysis of term occurrences. Her conclusion was that the LDA method is impractical and noisy. (This finding was confirmed by one of the originators of the method in conversation with me during the course of Hillevi’s project.) Her thesis also highlights the lack of useful evaluation methods for keyword extraction.
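For readers unfamiliar with the chi-square approach, the following is a minimal sketch of one common way to score terms by how strongly their frequency in a target text differs from a reference text. It is not taken from the thesis; the function name `chi_square_scores`, the whitespace-split token lists, and the choice to keep only over-represented terms are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(target_tokens, reference_tokens):
    """Score each term in the target by a 2x2 chi-square statistic.

    Hypothetical helper for illustration only. Compares a term's count
    in the target corpus against its count in a reference corpus.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = sum(target.values())
    n_reference = sum(reference.values())

    scores = {}
    for term, a in target.items():
        b = reference.get(term, 0)   # term count in reference
        c = n_target - a             # other tokens in target
        d = n_reference - b          # other tokens in reference
        n = a + b + c + d
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue  # statistic is undefined for a degenerate table
        # Assumption: keep only terms that are over-represented in the target.
        if a / n_target <= b / n_reference:
            continue
        scores[term] = n * (a * d - b * c) ** 2 / denom
    return scores
```

Sorting the returned dictionary by value, highest first, gives a ranked list of candidate salient terms for the target text. In practice the statistic is unreliable for very rare terms, so a minimum-count filter would usually be added.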
The intention of Hillevi’s project is to build a visualizer that shows how terminology varies and flows from channel to channel in public discourse. This will be done within a project funded by Internetstiftelsen.