Interview in Od

In this interview in Od – Overheidsdocumentatie, Maarten Marx and I talk about our workshop "AI en open overheid" (AI and open government). I also get some room to briefly explain why I am so enthusiastic about our work at the ICAI OpenGov Lab (so enthusiastic that I have, for now, left my career in industry behind me).

It is simply wonderful to sink my teeth into a domain with so many challenges crying out for AI solutions (and honestly: how often do you really come across that?). From making large archives full of civil-service communication searchable for Woo requests, to making jargon-heavy texts understandable for citizens, to discovery problems for journalists.

These are problems where simply scaling up with extra staff does not work (see the many reports on how the Wet Open Overheid, the Dutch Open Government Act, plays out in practice), and where the responsible use of "my" kind of AI (information retrieval and natural language processing) is not just fun tinkering, but a necessity.

Read the interview here: https://od-online.nl/artikel/ai-kan-beleid-slimmer-maken/

#newjob

I’m starting something new! I’ve joined the Institute for Logic, Language and Computation (ILLC) at the University of Amsterdam as an Assistant Professor, where I’ll be leading the newly established Artificial Intelligence for Open Government (AI for OpenGov) ICAI Lab. I am very excited to dive into this societally relevant (in particular in today’s political climate) topic, and I am convinced AI can play a meaningful and important role in enabling a more transparent government.

After eight years of working in industry, returning to academia was never part of some grand master plan (there is none), but I was particularly drawn to this ICAI Lab, which has been set up with the Rijksorganisatie voor Informatiehuishouding (RvIHH), as it means I get to do applied research in Information Retrieval and Natural Language Processing, with real impact. The ICAI Lab will kick off with three PhD students under my supervision, who are also employed at the RvIHH (and hence close to the action). We get to work with a so-called Living Lab: a "real" search engine with (actual) open data, UvA's WooGle. In sum: applied research, on a societally relevant and impactful topic, with real data and users! What's not to like?
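
For a flavor of the kind of IR problem at the heart of this (ranking government documents for a query), below is a toy BM25 ranker in a few lines of Python. It is purely illustrative: the example documents are made up, and it implies nothing about how WooGle itself is actually built.

```python
import math
from collections import Counter

# Toy corpus: made-up snippets of Dutch government documents.
DOCS = [
    "besluit op woo-verzoek inzake stikstofbeleid",
    "interne e-mails over vergunningverlening",
    "rapport over de uitvoering van de wet open overheid",
]

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75):
    """Score each document against the query with bare-bones Okapi BM25."""
    tokenized = [d.split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.split():
            df = sum(term in d for d in tokenized)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

# The document about the Woo request ranks first for a Woo-related query.
print(bm25_scores("woo-verzoek stikstofbeleid", DOCS))
```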

I consider myself lucky to be able to take this new turn in my career and am looking forward to this new adventure!

Opinion: Fact-Checking Disappears on Meta, But the Real Influence Comes Later

Zuckerberg scraps fact-checking on Facebook and Instagram. Outrage ensues. Less moderation, more disinformation, less control over harmful content — or at least, that’s what it sounds like. But how bad is it really? When Musk dismantled moderation on X, yes, the platform turned into an open sewer. But users also flocked to alternatives like Mastodon and Bluesky (which are actually quite pleasant). As a result, X’s power diminishes, and the same will undoubtedly happen to Meta. Social media is fragmenting, with its influence scattered across an increasing number of platforms. No disaster.

A real concern lies elsewhere. Not with social media, but with the systems shaping the future of information — systems that are also in the hands of big tech: generative AI and Large Language Models. While social media is splintering, AI is consolidating. This is because developing AI is expensive — too expensive for "open" initiatives like Hugging Face's BLOOM or the Netherlands' own GPT-NL: well-intentioned projects that unintentionally reveal it's impossible to develop AI that is sustainable, fair, and useful all at once.

And that’s a problem. Recent research shows that Large Language Models absorb and reproduce the ideological preferences of their creators. This happens subtly: in how questions are answered, which perspectives are amplified, and how information is framed.
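
To make this concrete, here is a minimal sketch of how one might begin to probe such framing differences across models. Everything in it is an assumption for illustration: `query_model` is a stand-in for whatever chat-completion client you use, and the keyword lexicons are a crude placeholder for the trained stance and framing classifiers that serious audit studies rely on.

```python
from collections import Counter

# Sketch: send identical prompts to several models and compare which
# "frames" dominate their answers. `query_model(model, prompt) -> str`
# is a hypothetical wrapper around your chat API of choice.

PROMPTS = [
    "Describe the main arguments for and against stricter immigration policy.",
    "Summarize the debate around content moderation on social media.",
]

# Crude illustrative lexicons; real audits use trained framing classifiers.
FRAME_TERMS = {
    "security_frame": {"border", "crime", "illegal", "enforcement"},
    "humanitarian_frame": {"refugee", "rights", "asylum", "humane"},
}

def frame_counts(text: str) -> dict[str, int]:
    """Count how often each frame's terms occur in a response."""
    tokens = text.lower().split()
    return {frame: sum(tokens.count(term) for term in terms)
            for frame, terms in FRAME_TERMS.items()}

def audit(models: list[str], query_model) -> None:
    """Tabulate framing-term usage per model over all prompts."""
    for model in models:
        totals = Counter()
        for prompt in PROMPTS:
            totals.update(frame_counts(query_model(model, prompt)))
        print(model, dict(totals))
```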

While you can fairly easily turn your back on social media platforms, doing so with AI is much harder. You can stop being a user, but AI systems are becoming ever more deeply entangled in our entire information ecosystem: in search engines, media production, email programs, coding tools — in short, on both the user and producer sides.

Although big tech is currently applying ethical "guardrails" to promote diversity (not always successfully), Zuckerberg's shift toward "less censorship" seems to foreshadow a broader rightward, populist shift in big tech. When this new direction inevitably reaches AI systems, those values will seep into everything they generate, interpret, and (re)produce. In this way, generative AI can subtly influence the framing of news, societal debates, and political preferences.

So, the issue isn’t that fact-checking on Facebook is disappearing — that’s something we can easily avoid. The real reckoning comes later, when AI quietly embraces Silicon Valley’s newfound appreciation for “free speech” and shifts the boundaries of what we collectively consider true and relevant.

Guest on the Filosofie in Actie Podcast: in conversation with David Graus

I was a guest on the Filosofie in Actie podcast, in which I talk with Piek Knijff about my background, my interests, and my ideas about recommender systems and the development of ethical AI.

We also briefly touch on my activist period as a self-appointed defender of algorithms 😏, and mention the opinion piece I wrote with Maarten de Rijke in NRC, still topical after eight years, if I may say so myself: "Wij zijn racisten, daarom Google ook" (We are racists, and therefore so is Google).

Listen to the episode on Spotify or Apple!

Special Issue on RecSys in HR at Frontiers in Big Data

During the opening of our 4th Workshop on Recommender Systems for Human Resources, Mesut Kaya announced our Research Topic (~ Special Issue) on Recommender Systems for Human Resources at the Frontiers in Big Data Journal.

Authors at the workshop, past and present, are particularly invited to (re-)submit their extended papers to this journal issue. The deadline for submitting summaries is October 31, 2024, and the deadline for full manuscript submission is January 31, 2025!

This Research Topic, developed in conjunction with the 4th Workshop on Recommender Systems for Human Resources (RecSys in HR 2024), explores the dynamic interplay between Artificial Intelligence (AI) and Human Resources (HR) Technologies. Focusing on Recommender Systems (RecSys) as a prime example of AI applications, this themed article collection provides a comprehensive view of their role in the HR domain.

This issue will be edited by me, together with Chris Johnson, Mesut Kaya, Toine Bogers, and Jens-Joris Decorte! For more details and the full CFP, see: https://www.frontiersin.org/research-topics/64365/recommender-systems-for-human-resources

Talk on Responsible AI at UWV’s IT Conference

On Tuesday, October 8, I will be giving a talk at a conference organized by the UWV (the Dutch Employee Insurance Agency). My talk, titled Challenge Accepted: Responsible Algorithms in the World of Work and News, will be about designing fair algorithms, drawing from my own experiences building recommender systems at FD Mediagroep and Randstad.

For more details, see the (Google Translated) blurb below:

Algorithms increasingly influence decisions in our daily lives, from work to media consumption. But how do we ensure that these technologies are fair, inclusive and ethical? In this session, David Graus, AI expert with years of experience in building search and recommendation systems, shares insights from his own practice, including developing a recommender system for the FD and matching work and talent at Randstad.

Using these practical examples, we discuss challenges of bias, ethics and inclusivity in algorithms. We show how you can engage with stakeholders to align technology with values and how you can use algorithms responsibly within the world of work and media.

Challenge accepted: together we take on the challenge of developing fair and inclusive algorithms that have a positive impact on our society.

Who is David Graus?
David Graus is an expert in information retrieval with a PhD in search engine technology. He works as a lead data scientist at Randstad, where he helped build AI systems for recruitment and selection. In addition to designing and building such systems, he researches their implications. David is academically active and regularly publishes papers, organizes workshops, and participates in an EU-funded research project focused on anti-discrimination in recruitment and selection algorithms.

RecSys in HR program and panel announced

With a mere two weeks to go until our 4th Recommender Systems for Human Resources Workshop in Bari, Italy, we have shared the workshop's final program and panel.

Papers

We were able to accept a total of ten high-quality submissions to the workshop's proceedings! They are:

  • Finding the perfect match at scale: A quest on freelancer-project alignment for efficient multilingual candidate retrieval (Warren Jouanneau, Marc Palyart and Emma Jouffroy)
  • MELO: An Evaluation Benchmark for Multilingual Entity Linking of Occupations (Federico Retyk, Luis Gasco, Casimiro Pio Carrino, Daniel Deniz Cerpa and Rabih Zbib)
  • Pseudo-online Measurement of Retrieval Recall for Job Recommendations – A case study at Indeed (Liyasi Wu, Yi Wei Pang and Warren Cai)
  • On the Biased Assessment of Expert Finding Systems (Jens-Joris Decorte, Jeroen Van Hautte, Chris Develder and Thomas Demeester)
  • Hardware-effective Approaches for Skill Extraction in Job Offers and Resumes (Laura VĂĄsquez-RodrĂ­guez, Bertrand Audrin, Samuel Michel, Samuele Galli, Julneth Rogenhofer, Jacopo Negro Cusa and Lonneke van der Plas)
  • A Dynamic Jobs-Skills Knowledge Graph (Alejandro Seif, Sarah Toh and Hwee Kuan Lee)
  • Combined Unsupervised and Contrastive Learning for Multilingual Job Recommendation (Daniel Deniz, Federico Retyk, Laura GarcĂ­a-Sardiña, Hermenegildo Fabregat, Luis Gasco and Rabih Zbib)
  • Parallel Computation-Driven Stable Matching for Large-Scale Reciprocal Recommender Systems (Kento Nakada, Kazuki Kawamura and Ryosuke Furukawa)
  • Enhancing Reliability in Recommendation Systems: Beyond point estimations to monitor population stability (Yingshi Chen, Mohit Jain, Vaibhav Sawhney and Liyasi Wu)
  • Creating Healthy Friction: Determining Stakeholder Requirements of Job Recommendation Explanations (Roan Schellingerhout, Francesco Barile and Nava Tintarev)

For the full program, please see: https://recsyshr.aau.dk/program/

Panel

Finally, as every year, we are hosting a panel on job recommendation, algorithmic hiring, and related HR Tech tasks, and I am happy to share our full list of invited panelists! Each year we try to strike a balance and find different perspectives and angles in the broad field where AI, RecSys, and HR meet. I think we've done a pretty good job this year, if I may say so myself.

Excited to have these experts share their insights at our workshop! For more details, including small bios of all our panelists, please see: https://recsyshr.aau.dk/panel/

Talk at CHI Nederland’s Experience & Beyond 2024

I'm looking forward to giving a talk at CHI Nederland's event Experience & Beyond 2024, themed "Breaking Barriers: How AI is Being Infused into Our Daily Life." I will be sharing with the HCI/UX crowd how we can design AI systems responsibly and consciously to do good things (e.g., enable more diverse news consumption and mitigate bias in hiring).

Update 18 Nov: see my slides here.

“Fairness and Bias in Algorithmic Hiring” Survey Paper published in ACM TIST

The aforementioned survey paper on fairness and bias in algorithmic hiring, in which I contributed a section on bias measurement and mitigation in practice, has been accepted for publication in ACM Transactions on Intelligent Systems and Technology (TIST)!

The paper is published open access and can currently be found in the "Just Accepted" section of the journal. See the DOI here: http://dx.doi.org/10.1145/3696457

Featured in “ING Sector” on Artificial Intelligence

I was interviewed for ING Sector, a quarterly magazine for corporate Netherlands, on (generative) AI. In it, I express my cautious skepticism about enterprise generative AI applications, with the quote:

we have not yet seen the big killer application of generative AI

My 2 cts: While startups, cloud providers, and consultants have been falling over each other for the past 2+ years trying to find (or sell) the next big killer generative AI app, we still haven't seen it. We can effectively leverage LLMs for a variety of natural language tasks (such as information extraction), for content representation (for downstream tasks), and for many other useful but niche use cases.

These are all valid, but merely incremental approaches compared to "traditional" machine learning (in some cases inferior, and in many cases much more expensive).

The real value of LLMs and #genAI will lie in supporting us in our daily (textual; information access-related) tasks. Roll on, slope of enlightenment!
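
As an illustration of the "useful but incremental" category above, here is a minimal sketch of prompt-based information extraction, a task a "traditional" pipeline would solve with a trained extraction model. The `complete` callable and the schema are illustrative assumptions, not any particular provider's API.

```python
import json

# Hypothetical schema for extracting job-change facts from free text.
SCHEMA = {"company": "string", "role": "string", "start_date": "YYYY-MM-DD"}

def extract(text: str, complete) -> dict:
    """Ask an LLM to fill a fixed JSON schema from free text.

    `complete` is a stand-in for a single chat-completion call
    (prompt string in, generated string out).
    """
    prompt = (
        "Extract the following fields from the text as JSON, "
        f"using exactly this schema: {json.dumps(SCHEMA)}\n\n"
        f"Text: {text}\n\nJSON:"
    )
    raw = complete(prompt)
    # In practice: validate against the schema and retry on malformed output.
    return json.loads(raw)
```

Compared to a supervised extractor this needs no labeled training data, but each call is slower and costlier, which is exactly the trade-off sketched above.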

Read the full article (in Dutch) here.

Invited talk on “algorithms; who is behind the wheel?” at inspiration day for vocational teachers

I gave an invited talk at the Inspiration Day for ICT and Creative Industry teachers, organized by the MBO Raad, the council for Vocational Education and Training (VET) in the Netherlands. In my talk I answered the question: “Algorithms, who is behind the wheel?” TL;DR: my answer? All of us đŸ’Ș.

The program further included a keynote by Ionica Smeets (Professor of Science Communication) and another invited talk by Jeroen Junte.

See the slides of my talk below:

https://www.slideshare.net/slideshow/embed_code/key/qDCA45shdRS5ko?hostedIn=slideshare&page=upload

RecSys in HR 2024 CFP published

We have published the call for papers for our Fourth Workshop on Recommender Systems for Human Resources (RecSys in HR 2024), to be held at the 18th ACM Conference on Recommender Systems in October in Bari 🇼đŸ‡č!

Do you work in AI, HR, and/or RecSys? Please consider submitting your work to our workshop! We accept research and position papers between 4 and 10 pages.

The (most) important dates for authors:

  1. Paper submission deadline: August 23, 2024
  2. Notification of acceptance: September 17
  3. Workshop date: 14-18 October (exact date TBD)

Take a look at the previous years' proceedings for inspiration on the type of work that gets published.

“Fairness and Bias in Algorithmic Hiring: a Multidisciplinary Survey” preprint available

Together with some of the researchers in FINDHR, we have authored and submitted an extensive survey on algorithmic hiring; the preprint is available online.

In this multidisciplinary work we bring together perspectives from computer science, law, and practice to extensively survey the literature and classify so-called "bias conducive factors" (BCFs), i.e., factors that contribute to bias in the algorithmic hiring process. These factors span the complete hiring pipeline, and are classified into three main families: institutional biases, individual preferences, and technology blindspots.

In addition, our paper surveys bias measures (n=21) and bias mitigation strategies (n=12) that have been applied and studied specifically in the context of algorithmic hiring, which we present in unified notation.
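
To give a flavor of what such a measure looks like (in my own simplified rendering, not the paper's unified notation), here is the classic impact ratio over selection rates, with the well-known four-fifths rule of thumb as a threshold; the numbers are invented.

```python
# Impact ratio (a.k.a. disparate/adverse impact ratio): the selection rate
# of a protected group divided by that of the reference group.

def selection_rate(selected: int, applicants: int) -> float:
    return selected / applicants

def impact_ratio(sel_prot: int, n_prot: int, sel_ref: int, n_ref: int) -> float:
    return selection_rate(sel_prot, n_prot) / selection_rate(sel_ref, n_ref)

# Invented example: 15/100 protected-group applicants selected vs 25/100
# in the reference group -> ratio 0.60, below the common 4/5ths threshold.
ratio = impact_ratio(15, 100, 25, 100)
print(f"impact ratio: {ratio:.2f}, flagged: {ratio < 0.8}")
```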

Finally, our survey lists datasets, summarizes the relevant legal landscape (w.r.t. regulations and non-discrimination provisions concerning algorithmic hiring in the EU and the US), and shows practical considerations and examples for bias mitigation in practice (which was my main contribution to this paper).

One of my main positive takeaways from this paper concerns the potential positive effects that algorithmic components can have in an inherently biased and complex hiring process:

One upshot of understanding bias as an inherently intersectional process is that it also offers a way to reduce discrimination. Since the factors that create bias are interrelated and mutually reinforcing, by halting or ameliorating one BCF, we may introduce positive feedback loops on other BCFs. By removing the discriminatory effect of any one factor, we can hope to reduce its influence on the other factors that reinforce each other in a discriminatory way.

All in all, I am very happy and proud to be listed among the authors of this monumental work, which surveys a highly complex field and offers both enough pointers to get started and useful recommendations for future work, grounded in (gaps in) the extensive literature.