yourHistory — Semantic linking for a personalized timeline of historic events

Title yourHistory — Semantic linking for a personalized timeline of historic events
Author David Graus, Maria-Hendrike Peetz, Daan Odijk, Ork de Rooij, Maarten de Rijke
Publication type Workshop Proceedings
Workshop name LinkedUp Challenge at Open Knowledge Conference (OKCon) 2013
Conference location Geneva, Switzerland
Abstract In this paper we present yourHistory: a Facebook application that aims to generate a tailor-made, personalized timeline of historic events, by matching a semantically enriched Facebook profile to a pool of candidate historic events extracted from DBpedia. Two aspects are central to our application: (i) semantic linking technologies backed by rich open web knowledge bases for generating semantically enriched user profiles, and (ii) semantic relatedness metrics for ranking historic events to user profiles. This paper describes the development of a Facebook application that aims to be engaging for users, whilst at the same time being a source for data that can be applied to evaluating or improving the application. We describe our Wikipedia-based semantic relatedness metric for event ranking, but also the restrictions and constraints concerning privacy-sensitive and ethical matters, around data storage and user consent. Finally, we reflect on how this type of user data can be applied for evaluating or improving both the semantic linking and event ranking methods in future work.
Full paper PDF [352.3 KB]

Generating Pseudo-ground Truth for Predicting New Concepts in Social Streams

Title Generating Pseudo-ground Truth for Predicting New Concepts in Social Streams
Author David Graus, Manos Tsagkias, Lars Buitinck, Maarten de Rijke
Publication type Full paper
Conference name 36th European Conference on Information Retrieval (ECIR ’14)
Conference location Amsterdam, The Netherlands
Abstract The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identification of nascent concepts is important for improving early entity linking, content interpretation, and recommendation of new content in real-time applications. We present an unsupervised method for generating pseudo-ground truth for training a named entity recognizer to specifically identify entities that will become concepts in a knowledge base in the setting of social streams. We show that our method is able to deal with missing labels, justifying the use of pseudo-ground truth generation in this task. Finally, we show how our method significantly outperforms a lexical-matching baseline, by leveraging strategies for sampling pseudo-ground truth based on entity confidence scores and textual quality of input documents.
Full paper PDF [256 KB]

Semantic Linking and Contextualization for Social Forensic Text Analysis

Title Semantic Linking and Contextualization for Social Forensic Text Analysis
Author Zhaochun Ren, David van Dijk, David Graus, Nina van der Knaap, Hans Henseler, Maarten de Rijke
Publication type Workshop Proceedings
Workshop name Workshop on Forensic Text Analysis (FORTAN)
Conference name European Intelligence and Security Informatics Conference (EISIC 2013)
Conference location Uppsala, Sweden
Abstract With the development of social media, forensic text analysis is becoming more and more challenging as forensic analysts have begun to include this information source in their practice. In this paper, we report on our recent work related to semantic search in e-discovery and propose the use of entity and topic extraction for social media text analysis. We first describe our approach for entity linking at the 2012 Text Analysis Conference Knowledge Base Population track and then introduce the personalized tweets summarization task, where entity linking is used for semantically enriching information in a social media context.
Full paper PDF [204 KB]

Semantic Search in E-Discovery: An Interdisciplinary Approach

Title Semantic Search in E-Discovery: An Interdisciplinary Approach
Author Graus, D.P., Ren, Z., van Dijk, D., van der Knaap, N., de Rijke, M. & Henseler, H.
Publication type Workshop Proceedings
Workshop name Workshop on Standards for Using Predictive Coding, Machine Learning, and Other Advanced Search and Review Methods in E-Discovery (DESI V Workshop)
Conference name ICAIL 2013
Conference location Rome, Italy
Abstract We propose an interdisciplinary approach to applying and evaluating semantic search in the e-discovery setting. By combining expertise from the fields of law and criminology with that of information retrieval and extraction, we move beyond “algorithm-centric” evaluation, towards evaluating the impact of semantic search in real search settings. We will approach this by collaboration in an interdisciplinary group of four PhD candidates, applying an iterative two-phase work cycle to four subprojects that run in parallel. In the first phase we work individually. We determine the use and needs of search in e-discovery (subproject 1), and simultaneously explore and develop state-of-the-art semantic search approaches (subprojects 2–4). In the second phase we collaborate, designing user experiments to evaluate how and where semantic search can support the analysts’ search process. By repeating this cycle multiple times we gain specific and in-depth knowledge and propose solutions to specific challenges in search in e-discovery.
Full paper PDF [144 KB]

Multilingual semantic linking for video streams: making “ideas worth sharing” more accessible

Title Multilingual Semantic Linking for Video Streams: Making “Ideas Worth Sharing” More Accessible
Author D. Odijk, E. Meij, D. Graus, and T. Kenter
Publication type Workshop Proceedings
Workshop name The 2nd International Workshop on Web of Linked Entities (WoLE2013)
Conference name WWW 2013
Conference location Rio de Janeiro, Brazil
Abstract This paper describes our submission to the Developers Challenge at WoLE2013, “Doing Good by Linking Entities.” We present a fully automatic system which provides intelligent suggestions in the form of links to Wikipedia articles for video streams in multiple languages, based on the subtitles that accompany the visual content. The system is applied to online conference talks. In particular, we adapt a recently proposed semantic linking approach for streams of television broadcasts to facilitate generating contextual links while a TED talk is being viewed. TED is a highly popular global conference series covering many research domains; the publicly available talks have accumulated a total view count of over one billion at the time of writing. We exploit the multilinguality of Wikipedia and the TED subtitles to provide contextual suggestions in the language of the user watching a video. In this way, a vast source of educational and intellectual content is disclosed to a broad audience that might otherwise experience difficulties interpreting it.
Full paper PDF

University of Amsterdam at TAC 2012

Title Context-Based Entity Linking – University Of Amsterdam At TAC 2012
Author Graus, D.P., Kenter, T.M., Bron, M.M., Meij, E.J., de Rijke, M.
Publication type Conference Proceedings
Conference name Text Analysis Conference 2012
Conference location Gaithersburg, MD
Abstract This paper describes our approach to the 2012 Text Analysis Conference (TAC) Knowledge Base Population (KBP) entity linking track. For this task, we turn to a state-of-the-art system for entity linking in microblog posts. Compared to the little context microblog posts provide, the documents in the TAC KBP track provide context of greater length and of a less noisy nature. In this paper, we adapt the entity linking system for microblog posts to the KBP task by extending it with approaches that explicitly rely on the query’s context. We show that incorporating novel features that leverage the context on the entity-level can lead to improved performance in the TAC KBP task.
Full paper PDF [131.42 KB]