“The birth of collective memories” published in JASIST!

📅 February 5, 2018 🕐 08:06 🏷 Papers and Research

The journal paper “The birth of collective memories: Analyzing emerging entities in text streams” I wrote with Daan Odijk and Maarten de Rijke is now (finally) published in JASIST! It is published as Open Access under CC BY 4.0 and available in “early view” (published before it’s published) in the Wiley Online Library. Click on the image below to access it:


The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams

📅 December 11, 2017 🕐 16:15 🏷 Papers

Our paper “The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams” was accepted for publication at JASIST (the Journal of the Association for Information Science and Technology)! Grab a pre-print here:

  • [PDF] D. Graus, D. Odijk, and M. de Rijke, “The birth of collective memories: Analyzing emerging entities in text streams,” Journal of the Association for Information Science and Technology, 2018.

This paper is:
1. My first journal paper
2. Based on Chapter 3 of my PhD thesis “Entities of Interest — Discovery in Digital Traces”
3. The first collabo on a paper (on paper) between the FD Mediagroep, Blendle, and the UvA
4. The tombstone on my academic career! (?)

In this paper we study news and social media streams spanning over 18 months and comprising over 579 million documents. We analyze ‘emergence patterns’ of entities, i.e., how a real-world entity (such as a person, organization, or product) appears in these documents in the timespan between the entity’s first mention in online text streams and the moment an article devoted to the entity is added to Wikipedia.
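To make that timespan a bit more concrete, here is a tiny sketch of how it could be measured for a single entity. This is not the code from the paper: the function name `emergence_span`, the example timestamps, and the choice to count mentions up to the Wikipedia creation date are all assumptions made for illustration.

```python
from datetime import datetime


def emergence_span(mention_times, wikipedia_created):
    """Summarize one entity's emergence window (illustrative only).

    Returns a tuple (days, n_mentions): the number of days between the
    first mention and the creation of the Wikipedia article, and the
    number of mentions seen in that window. Returns None if the entity
    was never mentioned before its article was created.
    """
    # Keep only mentions that precede (or coincide with) the article.
    before = [t for t in mention_times if t <= wikipedia_created]
    if not before:
        return None
    first_mention = min(before)
    return (wikipedia_created - first_mention).days, len(before)


# Hypothetical entity: four mentions, article created on April 9, 2012.
mentions = [
    datetime(2012, 1, 10),
    datetime(2012, 2, 1),
    datetime(2012, 3, 15),
    datetime(2012, 6, 1),  # after the article exists, so not counted
]
created = datetime(2012, 4, 9)

span_days, n_mentions = emergence_span(mentions, created)
print(span_days, n_mentions)
```

In the paper the interesting part is of course how mentions are distributed inside that window across news and social media, which this sketch does not attempt to model.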


Financial News Mining @ FD Mediagroep/Company.info Slidedeck

📅 November 24, 2017 🕐 11:49 🏷 Blog

Here are the slides of a talk I gave at the Data Science Northeast Netherlands Meetup, where I detail the custom in-house entity linking framework, sentiment analysis, and entity salience scoring model we developed for Company.info (part of FD Mediagroep), in addition to showing some example applications of our corpus of news articles linked to organization profiles.

I’m sharing it here because I think it’s cool, since it’s one of the first projects I’ve done at Company.info! Gives you some idea of what we’re working on.

In “Denktank” on algorithms, behavioral analysis, and personalization

📅 November 13, 2017 🕐 11:54 🏷 Media

My debut on national TV ;-)! Denktank is a TV show where youngsters explore and think about how (current day) technology will affect them in the future. In this episode I explain some of the mechanisms behind algorithmic personalization.

Stream the episode at NPO.nl (the part with me starts at about 05:00), or see the website of Human for more information on the episode.

Hosted 8th Recsys Amsterdam Meetup

📅 October 20, 2017 🕐 12:47 🏷 Blog

On Thursday 19 October, I had the pleasure of hosting the 8th Recommender Systems Amsterdam meetup at FDMG/Company.info. The meetup’s theme was media-content recsys, and we had three talks from industry, dealing with recommending TV programs, music videos, and text articles:

  1. Ghida Ibrahim (Senior Data Scientist, formerly at Liberty Global): “Recommender systems for video and TV products”
  2. Bouke Huurnink and Roman Ivanov (XITE): “Music Video Recommendation@XITE”
  3. Robbert van der Pluijm (Head of Bibblio Labs, Bibblio): “Scaling a recommendation service – a threefold story”

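Company.info wrote a small blog post about it (in Dutch), check it out here: “Meetup: het succes van algoritmen en systemen voor personalisatie en aanbevelingen” (“Meetup: the success of algorithms and systems for personalization and recommendations”)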

Featured in FD on the value of (personal) data

📅 July 3, 2017 🕐 09:37 🏷 Media

In today’s edition of Het Financieele Dagblad, I am quoted in an article on the value of (personal) data titled “Wanneer je gegevens geld waard zijn” (“When your data is worth money”). Translated from the Dutch:

The knowledge collected with those cookies is then sold to dozens more companies, which can use it to target their advertising messages. ‘Wherever you go on the web, you always leave digital traces behind’, says David Graus, who received his PhD on this subject at the University of Amsterdam two weeks ago. ‘From all those traces, companies predict your behavior, and based on that they place an advertisement.’ […]

The possibilities with data go further, says Graus. Suppose that, based on the behavior of friends, family members, likes, posts, and search queries, it is concluded that you smoke, even though you never stated that yourself. ‘That way you give away privacy’, according to Graus.

Read the full article here.

I am a doctor!

📅 June 18, 2017 🕐 14:34 🏷 Blog

And it was a beautiful day. Thanks to everyone who attended my defense, to Daan for this (+ more) great picture, my paranymphs Rutger and Marijn for nymphing like a boss, and my committee for grilling but not burning me. Band pic:

Me and the gang. Photo by Daan Odijk.