Combining multiple signals for semanticizing tweets

Title Combining multiple signals for semanticizing tweets: University of Amsterdam at #Microposts2015
Authors Cristina Gârbacea, Daan Odijk, David Graus, Isaac Sijaranamual, Maarten de Rijke
Publication type Workshop paper
Workshop name #Microposts2015 – 5th Workshop on Making Sense of Microposts
Conference name WWW ’15
Conference location Florence, Italy
Abstract In this paper we present an approach for extracting and linking entities from short and noisy microblog posts. We describe a diverse set of approaches based on the Semanticizer, an open-source entity linking framework developed at the University of Amsterdam, adapted to the task of the #Microposts2015 challenge. We consider alternatives for dealing with ambiguity that can help in the named entity extraction and linking processes. We retrieve entity candidates from multiple sources and process them in a four-step pipeline. Results show that we manage to identify entity mentions correctly (our best run attains an F1 score of 0.809 on the strong mention match metric), but the subsequent steps prove more challenging for our approach.
Paper PDF [92 KB]
