Joined the #RecSys2021 organizing committee

📅 September 26, 2020 • 🕐 14:49 • 🏷 Blog • 👁 4

After attending the beautiful virtual 14th ACM Conference on Recommender Systems (RecSys2020), I am happy to start looking forward to RecSys2021, which will be held in Amsterdam!

I am super excited to share that I’ve joined the organizing committee of RecSys2021 as local outreach chair, which means I’ll assist the other chairs and link (local) industry and companies to the conference.

I’m looking forward to it! I have quite fond memories of co-organizing last year’s DIR 2019, and of helping out with the local organization of ECIR 2014 in Amsterdam.

ACM RecSys 2021: September 27 – October 1, in Amsterdam

Internships and MSc. projects at Randstad Groep Nederland

📅 July 6, 2020 • 🕐 13:28 • 🏷 Blog • 👁 182

Come join us in Diemen!

About Randstad

Work with impact. At Randstad Groep Nederland IT you keep the country moving, enabling people across sectors to do their work, getting pizza on your table and your suitcase on the plane. Your AI solutions mean tomorrow’s recruiter is smarter and faster but still embodies our human forward approach, combining tech with a personal touch and putting people first – including you. You will constantly experiment, working on new NLP use cases and matching systems or expanding our self-service data platform. If you bring the idea, we will provide the freedom to explore, so you can help us shape the world of work.

Data Science @ RGN

Randstad IT is organized in a variation of the Spotify Engineering Model with squads, tribes, and chapters. Our data science chapter spans 12 data scientists, data engineers and machine learning engineers over 3 departments (IT, finance, and marketing), across 6 different teams. These teams work on recommender systems for algorithmic job matching, natural language processing and information extraction, forecasting, and more. We are further interested in AI fairness and auditing, explainability, and transparency.

Who are you?

We’re looking for students studying AI, data science, or related programs, for either graduation projects or regular internships. Fluency in Python is required, and we expect our interns to work autonomously. However, as an intern you’ll be a fully fledged member of our chapter, which means you get to benefit from the knowledge that is being shared in our chapter.

Here’s the overview of our suggested projects:

  • (Deep) Reinforcement Learning-based Planning & Poolmanagement
  • Writing style transfer learning
  • Career pathing MVP
  • Pairwise learning to rank for SmartMatch
  • Revenue forecasting using time-series algorithms
  • Structured information extraction from resumes
  • Salary parsing from vacancies
  • Record linkage for company linking
  • Free text notes and comments for improved job matching

Joined the board of SETUP

📅 May 29, 2020 • 🕐 12:32 • 🏷 Blog • 👁 4

I have joined the board of SETUP, a Utrecht-based medialab established in 2010. SETUP’s mission is:

to educate a wide audience, providing them with the tools necessary to design this brave new world, and infuse it with human values and new-found creativity.


This mission perfectly fits my personal conviction that knowledge and understanding of technology through media and algorithmic literacy, rather than fear and repression, is vital as we progress into our technology-infused future! See, e.g., what I wrote on the neutrality of algorithms, or on “algorithmic literacy.”

photo: Sebastiaan ter Burg for SETUP

Prior to joining their board, I had been following SETUP for a couple of years, joining some of their meetups and giving a talk at one of their events in 2018, “leven met algoritmen.” I am very excited to start as a board member and help set up SETUP’s future!

I have emerged…

📅 May 9, 2020 • 🕐 10:12 • 🏷 Blog • 👁 162

… as an entity in the Google Knowledge Graph!

Which is funny, because “emerging entities” were the main topic of my PhD Thesis [1]. With my co-authors I’ve published research on:

  1. Learning how to recognize “out-of-knowledge base” entities emerging on social media [2]
  2. How our collective memory is formed through “emerging entities” on Wikipedia [3], and more generally
  3. Entity retrieval and ranking [4] where Google’s so-called “Knowledge Panels” often served as examples…

Google’s AI unleashes the long tail?

(FYI: I’m not sure how I ended up there; the metadata seems to come from Google Scholar.)


[1] D. Graus, “Entities of Interest — Discovery in Digital Traces,” PhD thesis, Informatics Institute, University of Amsterdam, 2017.
[2] D. Graus, M. Tsagkias, L. Buitinck, and M. de Rijke, “Generating pseudo-ground truth for predicting new concepts in social streams,” in Advances in Information Retrieval (ECIR '14), Cham: Springer International Publishing, 2014, pp. 286–298.
[3] D. Graus, D. Odijk, and M. de Rijke, “The birth of collective memories: analyzing emerging entities in text streams,” Journal of the Association for Information Science and Technology, vol. 69, no. 6, pp. 773–786, 2018. doi: 10.1002/asi.24004.
[4] D. Graus, M. Tsagkias, W. Weerkamp, E. Meij, and M. de Rijke, “Dynamic collective entity representations for entity ranking,” in Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM '16), San Francisco, California, USA. New York, NY, USA: Association for Computing Machinery, 2016, pp. 595–604. doi: 10.1145/2835776.2835819.

Panel @ CPDP2020: "Algorithms and AI-driven technologies in the information society"

📅 February 4, 2020 • 🕐 10:03 • 🏷 Blog • 👁 4

I was invited by UvA’s Information, Communication and the Data Society (ICDS) to participate in a panel at the Computers, Privacy and Data Protection conference (CPDP2020), which was focused on AI.

The recording of the panel is now online. Watch me telling a room full of (highly) privacy-aware (and cookie-averse) people that Cambridge Analytica nudging people to “politically activate them” with tailored information can be a “democratic good” 😅.

See the recording below:

For more information, see CPDP’s page of the panel.

“Bias in Recommendations” lecture @ SIKS Course on advances in IR

📅 October 8, 2019 • 🕐 20:04 • 🏷 Blog • 👁 95

Enjoyed giving a lecture at the SIKS Course “Advances in Information Retrieval” at the Mitland Hotel in Utrecht. I also pitched DIR 2019 😅 (as evidenced by the picture above from Arjen). See my slidedeck below!

This talk is loosely based on (part of) the talk I gave at the ACM RecSys Summer School, but I added a few slides on dealing with implicit feedback (= clicks) and popularity bias.

“RecSys in the Media Industry” Lecture at RecSys Summer School

📅 September 11, 2019 • 🕐 13:39 • 🏷 Blog • 👁 100

With Daan Odijk I gave a lecture + hands-on workshop at the ACM Summer School on Recommender Systems in Gothenburg, Sweden on RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.

📸 by Alan Said

We gave a long (90+ min) lecture combining insights, experiences, and projects from our work at RTL and Blendle (Daan), and FD Mediagroep (me).

In addition, we did a small hands-on workshop, implementing a content-based re-ranker for WikiNews.
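To give a rough idea of what a content-based re-ranker does, here is a minimal sketch. It is not the code from our workshop notebooks: the function names (`bag_of_words`, `cosine`, `rerank`), the plain bag-of-words representation, and the toy articles are all assumptions made for illustration. The idea is to build a profile from the articles a reader has already read, and then re-order the candidate articles by how similar their text is to that profile.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(candidates, read_articles):
    """Re-order candidates by similarity to what the reader read before.

    candidates: list of (article_id, text) pairs in their original ranking.
    read_articles: list of texts the reader has already read.
    """
    # Hypothetical user profile: summed term counts of the reading history.
    profile = Counter()
    for text in read_articles:
        profile.update(bag_of_words(text))
    scored = [(cosine(profile, bag_of_words(text)), article_id)
              for article_id, text in candidates]
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article_id for _, article_id in scored]


# Toy data, made up for this example.
candidates = [
    ("a", "Stock markets fall on trade fears"),
    ("b", "New recommender systems research published"),
    ("c", "Football cup final tonight"),
]
history = ["Research on recommender systems and ranking"]
print(rerank(candidates, history))
```

In practice you would swap the raw term counts for something like TF-IDF weights or embeddings, and blend the similarity score with recency and popularity signals rather than sorting on content similarity alone.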

See our slides and notebooks here:

See a tweet by @alansaid, here:

Finally, see my slidedeck here:

“Improving automated segmentation of radio shows with audio embeddings”

📅 July 5, 2019 • 🕐 15:41 • 🏷 Blog and Research • 👁 79

Update (28/1/2020): Oberon’s thesis was accepted and will be published at the 45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020), to be held May 4-8 in Barcelona, Spain! The submission is co-authored with Klaus Lux and myself.

Oberon Berlage recently successfully defended his MSc. thesis (title above!) for the Data Science Master at University of Amsterdam, and graduated with a whopping 9!

He’s the first academic offspring of our AI Team @ FD Mediagroep, and worked on BNR SMART Radio’s segmenter. Oberon improved our text-based segmenter by adding audio embeddings, improving the F1 score by 32%!

His thesis is now online, check it out at: