Featured in Recsperts on ‘Recommender Systems in Human Resources’

It is now official, right? I am now a recommender system expert (by virtue of appearing on Marcel Kurovski’s Recsperts podcast series).

It was great fun to geek out with a knowledgeable host on anything recsys in HR. We touch on fairness, difficulties with delayed feedback signals, our RecSys in HR Workshop, and how I discover obscure rap music (OK, that’s not HR).

AI for Good at Felix Meritis

Yesterday evening I was a guest next to Dasha Simons (IBM) and Oumaima Hajri (Hogeschool Rotterdam) in Felix Meritis at the invitation of Laurens Vreekamp to talk about — among other things — fair AI, ethics, and algorithmic bias.

It was nice and important to be able to discuss these topics from an optimistic and pragmatic perspective, because algorithms — when designed thoughtfully — offer so much more than scary filter bubbles or addictive “social dilemmas.”

See the event page for more details.

“Transfer learning for multilingual vacancy text generation” preprint available

Anna Lőrincz’ UvA MSc Data Science thesis “Transfer learning for multilingual vacancy text generation” (which was graded a 9/10 💫) was recently accepted at the Second Version of the Generation, Evaluation & Metrics (GEM) Workshop 2022, which will be held as part of EMNLP, December 7-11, 2022!

Get the pre-print here:

  • [PDF] [DOI] A. Lőrincz, D. Graus, D. Lavi, and J. L. M. Pereira, “Transfer learning for multilingual vacancy text generation,” in Proceedings of the 2nd workshop on natural language generation, evaluation, and metrics (GEM), Abu Dhabi, United Arab Emirates (Hybrid), 2022, p. 207–222.
    [Bibtex]
    @inproceedings{lorincz2022transfer,
    author = {L{\H{o}}rincz, Anna and Graus, David and Lavi, Dor and Pereira, Jo{\~a}o L. M.},
    title = {Transfer learning for multilingual vacancy text generation},
    booktitle = "Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.gem-1.18",
    doi = "10.18653/v1/2022.gem-1.18",
    pages = "207--222",
    abstract = "Writing job vacancies is a repetitive and expensive task for humans. This research focuses on automatically generating the benefit sections of vacancies at redacted from job attributes using mT5, the multilingual version of the state-of-the-art T5 transformer trained on general domains to generate texts in multiple languages. While transformers are accurate at generating coherent text, they are sometimes incorrect at including the structured data (the input) in the generated text. Including the input correctly is crucial for vacancy text generation; otherwise, the candidates may get misled. To evaluate how the model includes the input we developed our own domain-specific metrics (input generation accuracy). This was necessary, because Relation Generation, the pre-existing evaluation metric for data-to-text generation uses only string matching, which was not suitable for our dataset (due to the binary field). With the help of the new evaluation method we were able to measure how well the input is included in the generated text separately for different types of inputs (binary, categorical, numeric), offering another contribution to the field. Additionally, we also evaluated how accurate the mT5 model generates the text in the requested language. The results show that mT5 is very accurate at generating the text in the correct language, at including seen categorical inputs and binary values correctly in the generated text. However, mT5 performed worse when generating text from unseen city names or working with numeric inputs. Furthermore, we found that generating additional synthetic training data for the samples with numeric input can increase the input generation accuracy, however this only works when the numbers are integers and only cover a small range.",
    }

In her work, Anna explores transformer models for data-to-text generation. More specifically: given structured inputs that represent the benefits in a vacancy text, such as categorical features (e.g., location), real-valued features (e.g., salary or hours of work per week), or binary features (e.g., contract type), the task is to generate a natural language snippet that expresses those features.
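To make the task setup a bit more concrete, here is a minimal sketch of how structured inputs like these could be flattened into a single source string for a sequence-to-sequence model, next to a simple template-based baseline. The field names, the `|` separator, the `lang:` prefix, and the templates are all illustrative assumptions of mine, not the format used in the thesis.

```python
from typing import Dict, Union

Value = Union[str, int, float, bool]


def linearize(features: Dict[str, Value], language: str) -> str:
    """Flatten structured job attributes into one source string.

    The layout is a hypothetical example, not the thesis's input format.
    """
    parts = [f"{name.upper()} = {value}" for name, value in sorted(features.items())]
    return f"lang: {language} | " + " | ".join(parts)


# Hypothetical hand-written templates: one per categorical or numeric field,
# and one per value for binary fields.
TEMPLATES = {
    "location": "The job is located in {value}.",
    "hours_per_week": "You will work {value} hours per week.",
    "permanent_contract": {
        True: "You get a permanent contract.",
        False: "You start on a temporary contract.",
    },
}


def template_generate(features: Dict[str, Value]) -> str:
    """Template baseline: fill one fixed sentence per input field."""
    sentences = []
    for name, value in sorted(features.items()):
        template = TEMPLATES[name]
        if isinstance(template, dict):
            # Binary field: pick the sentence that matches the value.
            sentences.append(template[value])
        else:
            sentences.append(template.format(value=value))
    return " ".join(sentences)


example = {"location": "Zwaag", "hours_per_week": 32, "permanent_contract": True}
source = linearize(example, "en")
baseline = template_generate(example)
```

A generation model would take `source` as input and produce free text, while the template baseline always emits the same sentence for the same field.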

Anna finds that transformers greatly increase (vocabulary) variation compared to template-based models, and require less human effort. The results were, to me, surprisingly good: further proof that transformers are taking over the world and making traditional NLP methods partly obsolete.

I was very much impressed with this work! But, to show how even transformers are not perfect, yet, I present you with my favorite error from the paper:

input: LOCATION = Zwaag
output: Pal gelegen achter het centraal station Zwaaijdijk!

Hope to catch you sometime in Zwaaijdijk!

RecSys in HR 2022 Workshop Recording available

We have published the full recording of our RecSys in HR 2022 workshop, which we held September 22 in Seattle, WA, USA.

The video is 5h42m43s long, so to guide you, here is a list of highlights (see the video description for timestamps that let you skip straight to the sections described below):

1️⃣ Our first keynote speaker, Robyn Rap, a data science leader at Indeed.com talks in depth about the importance of collaboration between #UX and Data Scientists in evaluating and developing search and recommendation systems. She provides a great (broad) overview of the challenges and differences of doing recsys in HR, compared to more common scenarios such as e-commerce or media. Great introduction into our deep field!

2️⃣ The panel, which includes Randstad’s Helen Hulsker, Carlos Castillo (ChaTo), Liangjie Hong (director of AI, engineering at LinkedIn), and the aforementioned Robyn Rap (still Indeed.com). The topics discussed by these experts: the role of HR Tech in the Global Labor Shortage, fair AI in Practice, multi-stakeholder development of HR Tech, and Regulation and Accountability.

3️⃣ Our second keynote speaker, Liangjie Hong, presents some of the foundational engineering work at LinkedIn that aims to serve many downstream AI applications. It revolves around a pipeline with (continuously updating) embedding representations for job seekers, jobs, and everything else, which are fused with LinkedIn’s (huge) Knowledge Graph.

4️⃣ There’s also a bunch of interesting paper presentations, including several from Indeed.com: Model Threshold Optimization for Segmented Job-Jobseeker Recommendation System (where the authors give a sneak peek into their overall recommendation setup at Indeed.com), Flexible Job Classification with Zero-Shot Learning by Thomas Lake, which shows how to use off-the-shelf transformer models for job classification, and Beyond human-in-the-loop: scaling occupation taxonomy at Indeed, where the authors show how they combine human intelligence with automation to scale taxonomies across languages and markets. Finally, there are some interesting and very pragmatic, hands-on papers on skill extraction, e.g., Mike Zhang’s Skill Extraction from Job Postings using Weak Supervision and Jens-Joris Decorte’s Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction.

Enjoy watching!

The 2nd RecSys in HR Workshop

And that’s a wrap! Yesterday we had our RecSys in HR Workshop at the ACM RecSys 2022 conference.

👉 I was happy to (virtually) chair the workshop’s panel (with Helen Hulsker, Carlos Castillo (ChaTo), Liangjie Hong, and Robyn Rap). I hope the rest of the panel will take Helen’s suggestion to heart and book a ☕️ meeting with their lawyer colleagues soon to discuss matters of privacy, compliance, and ethics in the context of AI in HR 😉.
👍 I was impressed by the thorough infrastructure and the shared, reusable job and job seeker representations that serve as a foundational component for many downstream products at LinkedIn (as told by Liangjie Hong during his keynote).
👏 I was inspired by the strong ties between UX Research and Data Science at Indeed.com, as shared by Robyn Rap in her keynote.
💪 I was proud to see our former interns Adam Mehdi Arafan and Roan Schellingerhout present their master theses at the workshop, work that came out of their internships with us at Randstad Groep Nederland!

Many thanks to my co-organizers, in particular the local Seattle team Chris Johnson and Toine Bogers, but also my remote fellows Mesut Kaya and Sepideh Mesbah for pulling an all-nighter 🌛.

Who knows, perhaps we meet again in Singapore next year 😁 (#RecSys2023).

Panelist on ‘AI and Ethics’ at Reshaping Work 2022

I’m excited to be joining a panel on “AI and Ethics” at the Reshaping Work 2022 Conference on October 14 at de Rode Hoed in Amsterdam, alongside Robert Seamans (New York University) and Anastasia Sergeeva (Vrije Universiteit Amsterdam). The panel will be chaired by Ting Li (Erasmus University Rotterdam).

Three papers accepted at RecSys in HR 2022 Workshop

🎉 A little success to share: three of our former data science interns at the Data Science chapter at Randstad Groep Nederland will present papers based on their master theses at our upcoming RecSys in HR Workshop, an academic workshop that revolves around AI in HR. The workshop is part of the ACM Conference on Recommender Systems (recommender systems being the AI systems used for matching, whether it is Netflix movies to users or, in our case, jobs to job seekers).

As always, the work of the students is pretty technical, but I will go ahead and try to provide little human-understandable summaries below.

Explainable Career Path Predictions using Neural Models

Roan Schellingerhout worked under supervision of Volodymyr Medentsiy on Explainable Career Path Predictions using Neural Models, where he trained deep neural networks on our own talent work history data to create a tool that can help consultants or talents predict possible career switches, given a talent’s work history as input. The predictions are visually explained, in the sense that the underlying reasons for proposing a certain job are provided. Roan tested these visualizations on consultants, and found that consultants generally like them.

End-to-End Bias Mitigation in Candidate Recommender Systems with Fairness Gates

  • [PDF] A. M. Arafan, D. Graus, F. P. Santos, and E. Beauxis-Aussalet, “End-to-end bias mitigation in candidate recommender systems with fairness gates,” in RecSys in HR’22: the 2nd workshop on recommender systems for human resources, 2022.
    [Bibtex]
    @inproceedings{arafan2022end,
    author = {Arafan, Adam Mehdi and Graus, David and Santos, Fernando P. and Beauxis-Aussalet, Emma},
    title = {End-to-End Bias Mitigation in Candidate Recommender Systems with Fairness Gates},
    year = {2022},
    booktitle = {RecSys in HR’22: The 2\textsuperscript{nd} Workshop on Recommender Systems for Human Resources},
    numpages = {8},
    location = {Seattle, WA, USA and Online},
    series = {CEUR Workshop Proceedings},
    url = {https://ceur-ws.org/Vol-3218/RecSysHR2022-paper_6.pdf},
    month={9}
    }

Adam Arafan worked under supervision of myself on “End-to-End Bias Mitigation in Candidate Recommender Systems with Fairness Gates.” In his thesis he experimented with making the SmartMatch Talent Recommender more fair (at the level of gender), either by changing the “input” of the algorithm (for example, by balancing male and female candidates in the training data), or by changing its “output” (for example, for a given list of candidates, going through the list to make sure the top 10 has a 50/50 balance between male and female candidates). His work is novel because these types of “bias mitigation” strategies have been studied in isolation, but never together.
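As an illustration of what changing the “output” can look like, below is a minimal sketch of a greedy re-ranker that enforces an equal share per group in the top k. This is my own simplified example, not the method from the thesis; the `(candidate_id, group, score)` tuple layout and the function name `rerank_balanced` are assumptions made for the sake of the example.

```python
from typing import List, Tuple

# Hypothetical candidate layout: (candidate_id, group, score).
Candidate = Tuple[str, str, float]


def rerank_balanced(candidates: List[Candidate], k: int = 10) -> List[Candidate]:
    """Re-rank so each group gets an equal quota in the top k, where possible."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    groups = {c[1] for c in ranked}
    quota = k // max(len(groups), 1)
    counts = {g: 0 for g in groups}

    top, rest = [], []
    for cand in ranked:
        # Take the best-scoring candidates of each group until its quota is full.
        if len(top) < k and counts[cand[1]] < quota:
            top.append(cand)
            counts[cand[1]] += 1
        else:
            rest.append(cand)

    # If a group is too small to fill its quota, top up with the best leftovers.
    while len(top) < k and rest:
        top.append(rest.pop(0))

    # Within the top k, keep the original score order.
    top.sort(key=lambda c: c[2], reverse=True)
    return top + rest


pool = [
    ("m1", "male", 0.9), ("m2", "male", 0.8), ("m3", "male", 0.7),
    ("f1", "female", 0.6), ("m4", "male", 0.5), ("f2", "female", 0.4),
]
reranked = rerank_balanced(pool, k=4)
top_ids = [c[0] for c in reranked[:4]]
```

In this toy pool, the plain score ranking would put three male candidates in the top 4; the re-ranker swaps the lowest of them for the next-best female candidate.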

Automated Personnel Scheduling with Reinforcement Learning and Graph Neural Networks

  • [PDF] B. Platten, M. Macfarlane, D. Graus, and S. Mesbah, “Automated personnel scheduling with reinforcement learning and graph neural networks,” in RecSys in HR’22: the 2nd workshop on recommender systems for human resources, 2022.
    [Bibtex]
    @inproceedings{platten2022automated,
    author = {Platten, Benjamin and Macfarlane, Matthew and Graus, David and Mesbah, Sepideh},
    title = {Automated Personnel Scheduling with Reinforcement Learning and Graph Neural Networks},
    year = {2022},
    booktitle = {RecSys in HR’22: The 2\textsuperscript{nd} Workshop on Recommender Systems for Human Resources},
    numpages = {10},
    location = {Seattle, WA, USA and Online},
    url = {https://ceur-ws.org/Vol-3218/RecSysHR2022-paper_1.pdf},
    series = {CEUR Workshop Proceedings},
    month={9}
    }

Ben Platten worked under supervision of Sepideh Mesbah on Automated Personnel Scheduling with Reinforcement Learning and Graph Neural Networks, in which he experimented with “reinforcement learning” (a specific machine learning paradigm), which in theory suits the challenging task of scheduling well. His experiments on a toy problem suggest that, indeed, the method works quite well.

See the full list of accepted papers here: https://recsyshr.aau.dk/accepted-papers/.

And stay tuned for the pre-prints, which I’ll share as soon as they’re available!

Another cohort of Data Science students finished

Very proud of the latest cohort of Data Science thesis interns at Randstad Groep Nederland. In the absence of a “real” defense at the University of Amsterdam, we organized our own afternoon packed with defenses (and subsequent drinks) at our Randstad HQ in Diemen. At the end of the afternoon we were able to congratulate Roan, Anna, and Adam on a job (almost) well done!

Roan Schellingerhout presented his work on “Explainable Career Path Predictions.” Roan implemented explainable deep neural nets for predicting and explaining a job seeker’s next opportunity, given their previous ones. He evaluated the models intrinsically, in addition to testing them (+ their explanations) with actual recruiters, and found both that the models are accurate and that recruiters like and understand them.

Anna Lőrincz worked on data-to-text generation, and fine-tuned a multilingual transformer model for generating benefits (salary, contract, working hours, locations) in job descriptions in both Dutch and English, given structured information (numeric, categorical, and binary variables). She found that transformers can successfully generate fluent and correct text given structured inputs, confirmed that inputs or prompts have a high impact on performance, and found that her approach beats template-based methods in textual diversity. She also found a few very funny hallucinated work locations (“pal achter centraal station in Zwaaijdijk”, was one of our favorites), and found that transformer models tend to sometimes correct output (adjusting a 3k/hour salary into a 3k monthly salary).

Finally, Adam Mehdi Arafan presented his “Double Fair-Gated Bias Mitigation Pipeline” for our Talent Recommender system, where he studied bias in multiple parts of our recsys pipeline, from re-balancing training data (to simulate both balanced and highly imbalanced scenarios), to generating additional balanced synthetic data, and re-ranking outputs. It turns out that applying synthetic data not only helps in creating more fair rankers, but can also have benefits in terms of model accuracy!
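To give an idea of what re-balancing training data to simulate balanced or imbalanced scenarios could look like, here is a minimal sketch that downsamples a dataset so that one group makes up a chosen share of the rows. It is a simplified illustration under my own assumptions (rows as dictionaries, a `gender` field, the function name `resample_to_share`), not the pipeline from the thesis.

```python
import random
from typing import Dict, List


def resample_to_share(
    rows: List[Dict], group_key: str, group: str, share: float, seed: int = 0
) -> List[Dict]:
    """Downsample rows so that `group` makes up roughly `share` of the result."""
    if not 0 < share < 1:
        raise ValueError("share must be strictly between 0 and 1")
    rng = random.Random(seed)
    in_group = [r for r in rows if r[group_key] == group]
    others = [r for r in rows if r[group_key] != group]

    # Largest total size for which both parts can still be drawn without replacement.
    total = min(len(in_group) / share, len(others) / (1 - share))
    n_group = round(total * share)
    n_other = round(total * (1 - share))

    sample = rng.sample(in_group, n_group) + rng.sample(others, n_other)
    rng.shuffle(sample)
    return sample


# Toy data: 80 male and 20 female rows (hypothetical field names).
data = [{"id": i, "gender": "male"} for i in range(80)]
data += [{"id": 80 + i, "gender": "female"} for i in range(20)]

balanced = resample_to_share(data, "gender", "female", share=0.5)
skewed = resample_to_share(data, "gender", "female", share=0.1)
```

Setting `share` to 0.5 yields a balanced scenario, while a small value such as 0.1 simulates a highly imbalanced one.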

All three students did a great job. Stay tuned for their theses (and, who knows, publications? 😏)

Panel on AI for a more inclusive labor market

At a conference in Utrecht I participated in a panel discussion alongside Hans de Zwart (HvA’s Centre of Expertise Applied AI), Rina Joosten-Rabou (Seedlink), Mildo van Staden (Ministerie van Binnenlandse Zaken en Koninkrijksrelaties) and Siri Beerends (SETUP).

I summarized my takeaways on LinkedIn, which I share below (in Dutch).


RecSys in HR at ACM RecSys 2022 in Seattle!

Fantastic news! We’ve received word the 2nd edition of our “Recommender Systems for Human Resources” (RecSys in HR) Workshop has been accepted to be included in the ACM RecSys 2022 program, to be held in Seattle!

Last year’s (first) edition of our workshop was co-located with ACM RecSys 2021 in Amsterdam, and featured two keynotes, a panel, breakout sessions and 8 paper presentations. The recording, workshop proceedings, and a workshop report are available through our workshop’s website at: https://recsyshr2021.aau.dk/

Check back there soon for information on the 2022 edition we’re planning with Toine Bogers, Mesut Kaya, Francisco Gutiérrez, and newly joined co-organizers Sepideh Mesbah (Randstad Groep Nederland) and Chris Johnson (Indeed.com)!

At KINTalks on AI in HR

I’m giving a talk at the KINTalks series, organized by the KIN Center for Digital Innovation on March 25. It’s going to be a hybrid event, so happy to meet you at the VU Amsterdam, and if not, see you online! RSVP here on EventBrite.

KINTalks is a hybrid event where practitioners are invited to talk about their work experience regarding innovation and digital technology.

The blurb

At Randstad, the global leader in the HR services industry, searching and matching is at the heart of what we do. Founded in 1960, we know from our heritage that real connections are not made from data and algorithms alone – they require human involvement. Last year, we helped more than two million job seekers find a meaningful job by combining industry-scale recommender and search systems with our distinct human touch. While many opportunities exist, employing AI in recruitment and HR is considered high-risk by the European Commission’s proposed regulatory framework on AI, which will bring additional requirements, obligations, and constraints.

In this hybrid talk, I will explain some of the characteristics of, challenges, and opportunities in the HR domain from an AI perspective. I will share some of our own work in recommendations, algorithmic matching, algorithmic bias and knowledge graphs, and highlight some of the ongoing research in this domain.

Two papers accepted at CompJobs ’22

We have two papers accepted at “The First International Workshop on Computational Jobs Marketplace”, co-located with WSDM 2022. Both papers are based on work done by two of our former thesis interns at Randstad Groep Nederland!

  • [PDF] N. Vermeer, V. Provatorova, D. Graus, T. Rajapakse, and S. Mesbah, “Using RobBERT and eXtreme multi-label classification to extract implicit and explicit skills from Dutch job descriptions,” in CompJobs ’22: the first international workshop on computational jobs marketplace, 2022.
    [Bibtex]
    @inproceedings{vermeer2022using,
    author = {Vermeer, Ninande and Provatorova, Vera and Graus, David and Rajapakse, Thilina and Mesbah, Sepideh},
    title = {Using RobBERT and eXtreme Multi-Label Classification to Extract Implicit and Explicit Skills From Dutch Job Descriptions},
    year = {2022},
    booktitle = {CompJobs '22: The First International Workshop on Computational Jobs Marketplace},
    numpages = {5},
    location = {Online},
    month={2}
    }

☝️ Ninande Vermeer worked under supervision of Sepideh Mesbah and Vera Provatorova (UvA) on: “Using RobBERT and eXtreme Multi-Label Classification to Extract Implicit and Explicit Skills From Dutch Job Descriptions” in which we study to what extent a RobBERT-XMLC model can be used to extract explicit and implicit skills from Dutch job descriptions.

  • [PDF] S. van Els, D. Graus, and E. Beauxis-Aussalet, “Improving fairness assessments with synthetic data: a practical use case with a recommender system for human resources,” in CompJobs ’22: the first international workshop on computational jobs marketplace, 2022.
    [Bibtex]
    @inproceedings{vanels2022improving,
    author = {van Els, Sarah-Jane and Graus, David and Beauxis-Aussalet, Emma},
    title = {Improving Fairness Assessments with Synthetic Data: a Practical Use Case with a Recommender System for Human Resources},
    year = {2022},
    booktitle = {CompJobs '22: The First International Workshop on Computational Jobs Marketplace},
    numpages = {5},
    location = {Online},
    month={2}
    }

✌️ Sarah-Jane van Els worked under supervision of myself and Emma Beauxis-Aussalet (Civic AI Lab) on “Improving Fairness Assessments with Synthetic Data: a Practical Use Case with a Recommender System for Human Resources” in which we explore approaches and methods for assessing algorithmic bias by using synthetic data to improve the size and representativity of a test set used for training candidate recommender systems.

👏 Proud of our former interns for having published their work! And happy with the collaborations we have had with our co-authors 😁.

Panel on AI and Inclusivity in the Labor Market

Update: unfortunately but understandably (in light of the pandemic), this conference has been postponed until further notice.

On December 9th I will join a panel discussion on AI and equal opportunities in the labor market (together with Siri Beerends & Rina Joosten-Rabou), at a conference organized by WOMEN INC. An offline event in Utrecht (Corona volente)!

More details and RSVP (in Dutch) here: Congres | “Hey Siri: Vind een geschikte kandidaat”