David Graus

In defense of algorithms

Pre-print of position paper “SMART Journalism: Personalizing, Summarizing, and Recommending Financial Economic News”

Friday, June 1, 2018

Our position paper “SMART Journalism: Personalizing, Summarizing, and Recommending Financial Economic News” was accepted at the Algorithmic Personalization and News (APEN18) workshop, held at ICWSM ’18!

In this paper, we detail some of the ideas and opportunities of personalization in the domain of financial economic news. Read the pre-print below!

  • [PDF] M. Sappelli, D. M. Chu, B. Cambel, D. Graus, and P. Bressers, “SMART Journalism: Personalizing, Summarizing, and Recommending Financial Economic News,” in The Algorithmic Personalization and News (APEN18) Workshop at ICWSM ’18, 2018.

Featured in article on ‘robo-journalism’ in the Netherlands

Sunday, May 13, 2018

Stimuleringsfonds voor de Journalistiek published an article on ‘robo-journalism’, where I say something about the SMART Journalism project we are doing at FDMG, which involves personalization and summarization of newspaper articles. Read it here! (pdf). Snippet:

‘By personalizing intro texts, you can serve more target audiences.’ Generating personalized intros from articles involves quite a bit of technology, says David Graus, lead data scientist of the project at the FD. ‘In robot journalism, most work currently goes into converting structured data into text. What we want is to produce texts based on texts written by people. That is pretty cutting edge. It is also why we have hardly any examples to build on.’

The Filter Bubble doesn’t exist!

Thursday, March 29, 2018

Yesterday I gave a (tongue-in-cheek) talk on algorithmic personalization at the VOGIN-IP Lezing 2018, and brought five pieces of evidence to prove the “filter bubble” doesn’t exist. Check out my slides (in Dutch) by clicking on the picture below!

“The birth of collective memories” published in JASIST!

Monday, February 5, 2018

The journal paper “The birth of collective memories: Analyzing emerging entities in text streams” I wrote with Daan Odijk and Maarten de Rijke is now (finally) published in JASIST! It is published under OpenAccess/CC BY 4.0 and available in “early view” (published before it’s published) in the Wiley Online Library. Click on the image below to access it:


Blogpost on Predictive insights from company information

Wednesday, January 31, 2018

For Company.info I wrote a short blog post explaining the current state of the art, as well as our current and future projects that involve machine learning and company information. Click the image below to read the post! (in Dutch)


The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams

Monday, December 11, 2017

Our paper “The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams” was accepted for publication at JASIST (the Journal of the Association for Information Science and Technology)! Grab a pre-print here:

  • [PDF] D. Graus, D. Odijk, and M. de Rijke, “The birth of collective memories: Analyzing emerging entities in text streams,” Journal of the Association for Information Science and Technology, 2018.

This paper is:
1. My first journal paper
2. Based on Chapter 3 of my PhD thesis “Entities of Interest — Discovery in Digital Traces”
3. The first collabo on a paper (on paper) between the FD Mediagroep, Blendle, and the UvA
4. The tombstone on my academic career! (?)

In this paper we study news and social media streams that span over 18 months and comprise over 579 million documents. We analyze the ‘emergence patterns’ of entities, i.e., how a real-world entity (such as a person, organization, or product) appears in these documents in the timespan between the entity’s first mention in online text streams and the moment an article devoted to the entity is added to Wikipedia.
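To make the idea of an emergence timespan concrete, here is a minimal sketch in Python. It is not the method from the paper: the function name `emergence_pattern`, the weekly bucketing, and the toy dates are all assumptions made for illustration. It only shows how one could measure the time between an entity's first mention and its Wikipedia article, and count mentions per week within that window.

```python
from collections import Counter
from datetime import date


def emergence_pattern(mention_dates, wikipedia_date):
    """Return (days_to_wikipedia, weekly_counts) for a single entity.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    # Keep only mentions dated on or before the Wikipedia article's creation date.
    window = sorted(d for d in mention_dates if d <= wikipedia_date)
    if not window:
        raise ValueError("no mentions before the Wikipedia article was added")

    first_mention = window[0]
    days_to_wikipedia = (wikipedia_date - first_mention).days

    # Bucket mentions by the number of whole weeks since the first mention.
    weekly = Counter((d - first_mention).days // 7 for d in window)
    n_weeks = days_to_wikipedia // 7 + 1
    weekly_counts = [weekly.get(week, 0) for week in range(n_weeks)]
    return days_to_wikipedia, weekly_counts


# Toy data (made up): dates of documents that mention one entity.
mentions = [
    date(2017, 1, 2),
    date(2017, 1, 3),
    date(2017, 1, 20),
    date(2017, 2, 1),
    date(2017, 3, 1),  # after the Wikipedia article, so it is ignored
]
days, counts = emergence_pattern(mentions, date(2017, 2, 5))
print(days, counts)
```

The weekly bucket size is an arbitrary choice here; a real analysis would pick a granularity that fits the stream's volume.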

Financial News Mining @ FD Mediagroep/Company.info Slidedeck

Friday, November 24, 2017

Here are the slides of a talk I gave at the Data Science Northeast Netherlands Meetup, where I detail the custom in-house entity linking framework, sentiment analysis, and entity salience scoring model we developed for Company.info (part of FD Mediagroep), in addition to showing some example applications of our corpus of news articles linked to organization profiles.

I’m sharing it here because I think it’s cool, since it’s one of the first projects I’ve done at Company.info! It gives you some idea of what we’re working on.

In “Denktank” on algorithms, behavioral analysis, and personalization

Monday, November 13, 2017

My debut on national TV ;-)! Denktank is a TV show where youngsters explore and think about how (current day) technology will affect them in the future. In this episode I explain some of the mechanisms behind algorithmic personalization.

Stream the episode at NPO.nl (the part with me starts at about 05:00), or see the website of Human for more information on the episode.

Hosted 8th Recsys Amsterdam Meetup

Friday, October 20, 2017

On Thursday, October 19, I had the pleasure of hosting the 8th Recommender Systems Amsterdam meetup at FDMG/Company.info. The meetup’s theme was media-content recsys, and we had three talks from industry, dealing with recommending TV programs, music videos, and text articles:

  1. Ghida Ibrahim (Senior Data Scientist, (formerly at) Liberty Global): “Recommender systems for video and TV products”
  2. Bouke Huurnink and Roman Ivanov (XITE): “Music Video Recommendation@XITE”
  3. Robbert van der Pluijm (Head of Bibblio Labs, Bibblio): “Scaling a recommendation service – a threefold story”

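Company.info wrote a small blog post about it (in Dutch). Check it out here: Meetup: het succes van algoritmen en systemen voor personalisatie en aanbevelingen (“Meetup: the success of algorithms and systems for personalization and recommendations”)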

Featured in FD on the value of (personal) data

Monday, July 3, 2017

In today’s edition of Het Financieele Dagblad, I am quoted in an article on the value of (personal) data titled “Wanneer je gegevens geld waard zijn” (“When your data is worth money”):

The knowledge collected with those cookies is then sold to dozens more companies, which can use it to target their advertising messages. ‘Wherever you go on the web, you always leave digital traces,’ says David Graus, who received his PhD on this subject at the University of Amsterdam two weeks ago. ‘From all those traces, companies predict your behavior, and based on that they place an advertisement.’ […]

The possibilities with data go further, Graus argues. Suppose that, based on the behavior of friends and family members, likes, posts, and search queries, it is concluded that you smoke, even though you never indicated this yourself. ‘That way you give away privacy,’ says Graus.

Read the full article here.