It was great fun to geek out with a knowledgeable host about all things recsys in HR. We touched on fairness, the difficulties of delayed feedback signals, our RecSys in HR Workshop, and how I discover obscure rap music (OK, that’s not HR).
Yesterday evening I was a guest next to Dasha Simons (IBM) and Oumaima Hajri (Hogeschool Rotterdam) in Felix Meritis at the invitation of Laurens Vreekamp to talk about — among other things — fair AI, ethics, and algorithmic bias.
It was nice and important to be able to discuss these topics from an optimistic and pragmatic perspective, because algorithms — when designed thoughtfully — offer so much more than scary filter bubbles or addictive “social dilemmas.”
For more details, see: https://felixmeritis.nl/programma/ai-for-good-2-inclusiviteit-veiligheid/
Anna Lőrincz’s UvA MSc data science thesis “Transfer learning for multilingual vacancy text generation” (which was graded a 9/10 💫) was recently accepted at the Second Version of Generation, Evaluation & Metrics (GEM) Workshop 2022, which will be held as part of EMNLP, December 7-11, 2022!
Get the pre-print here:
[bibtex file=citations.bib key=lorincz2022transfer]
In her work, Anna explores transformer models for data-to-text generation. More specifically: given structured inputs such as categorical features (e.g., location), real-valued features (e.g., salary or hours of work per week), or binary features (e.g., contract type) that represent benefits in vacancy texts, the task is to generate a natural language snippet that expresses that feature.
Anna finds that using transformers greatly increases (vocabulary) variation when compared to template-based models, and needs less human effort. The results were — to me — surprisingly good, another proof that transformers are taking over the world and making traditional NLP methods partly obsolete.
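To make the comparison concrete, here is a minimal sketch of what a template-based baseline for this task could look like. The feature names and sentences are made up for illustration and are not taken from Anna's paper; the point is only that every input maps to one fixed phrasing, which is where the lack of variation comes from.

```python
from typing import Dict, Union

# Hypothetical templates: one fixed sentence per feature (not from the paper).
TEMPLATES: Dict[str, Union[str, Dict[bool, str]]] = {
    "location": "The job is located in {value}.",                  # categorical
    "salary": "You will earn {value} euros per month.",            # real-valued
    "hours": "You will work {value} hours per week.",              # real-valued
    "permanent_contract": {                                        # binary
        True: "You get a permanent contract.",
        False: "You start on a temporary contract.",
    },
}


def render_benefit(feature: str, value: Union[str, float, bool]) -> str:
    """Turn one structured (feature, value) pair into a text snippet."""
    template = TEMPLATES[feature]
    if isinstance(template, dict):
        # Binary features pick one of two fixed sentences.
        return template[bool(value)]
    return template.format(value=value)


print(render_benefit("location", "Zwaag"))
print(render_benefit("hours", 32))
print(render_benefit("permanent_contract", False))
```

A template like this never invents a location, but it also says the same thing in the same way every time, and someone has to write and maintain each template by hand.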
I was very much impressed with this work! But, to show how even transformers are not perfect yet, I present you with my favorite error from the paper:
input: LOCATION = Zwaag
output: Pal gelegen achter het centraal station Zwaaijdijk! (in English, roughly: “Located right behind Zwaaijdijk central station!”)
Hope to catch you sometime in Zwaaijdijk!
We have published the full recording of our RecSys in HR 2022 workshop, which we held on September 22 in Seattle, WA, USA.
The video is 5h42m43s long, so to guide you, here is a list of highlights (see the video description for timestamps that let you skip straight to the sections described below):
1️⃣ Our first keynote speaker, Robyn Rap, a data science leader at Indeed.com, talks in depth about the importance of collaboration between #UX and Data Scientists in evaluating and developing search and recommendation systems. She provides a great (broad) overview of the challenges and differences of doing recsys in HR, compared to more common scenarios such as e-commerce or media. Great introduction to our deep field!
2️⃣ The panel, with Randstad’s Helen Hulsker, Carlos Castillo (ChaTo), Liangjie Hong (director of AI, engineering at LinkedIn) and the aforementioned Robyn Rap (still Indeed.com). These experts discuss the role of HR Tech in the global labor shortage, fair AI in practice, multi-stakeholder development of HR Tech, and regulation and accountability.
3️⃣ Our second keynote speaker, Liangjie Hong, presents some of the foundational engineering work at LinkedIn that aims to serve many downstream AI applications. It revolves around a pipeline with (continuously updating) embedding representations for job seekers, jobs, and everything else, which are fused with LinkedIn’s (huge) Knowledge Graph.
4️⃣ There are also plenty of interesting paper presentations, including several from Indeed.com: Model Threshold Optimization for Segmented Job-Jobseeker Recommendation System (where the authors give a sneak peek at the overall setup of recommendations at Indeed.com); Flexible Job Classification with Zero-Shot Learning by Thomas Lake, which shows how to use off-the-shelf transformer models for job classification; and Beyond human-in-the-loop: scaling occupation taxonomy at Indeed, where the authors show how they combine human intelligence with automation to scale taxonomies across languages and markets. Finally, there are some interesting and very pragmatic, hands-on papers on skill extraction, e.g., Mike Zhang’s Skill Extraction from Job Postings using Weak Supervision and Jens-Joris Decorte’s Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction.
And that’s a wrap! Yesterday we had our RecSys in HR Workshop at the ACM RecSys 2022 conference.
👉 I was happy to (virtually) chair the workshop’s panel (with Helen Hulsker, Carlos Castillo (ChaTo), Liangjie Hong, and Robyn Rap). I hope the rest of the panel will take Helen’s suggestion to heart and book a ☕️ meeting with their lawyer colleagues soon to discuss matters of privacy, compliance, and ethics in the context of AI in HR 😉.
👍 I was impressed by the thorough infrastructure and the shared, reusable job and job seeker representations that serve as a foundational component for many downstream products at LinkedIn (as told by Liangjie Hong during his keynote).
👏 Inspired by the strong ties between UX Research and Data Science at Indeed.com, as shared by Robyn Rap in her keynote.
💪 Proud to see our former interns Adam Mehdi Arafan and Roan Schellingerhout present their master’s theses at the workshop, work that came out of their internships with us at Randstad Groep Nederland!
Many thanks to my co-organizers, in particular the local Seattle team Chris Johnson and Toine Bogers, but also my remote fellows Mesut Kaya and Sepideh Mesbah for pulling an all-nighter 🌛.
Who knows, perhaps we meet again in Singapore next year 😁 (#RecSys2023).
I’m excited to be joining a panel on “AI and Ethics” at the Reshaping Work 2022 Conference on October 14 at de Rode Hoed in Amsterdam, alongside Robert Seamans (New York University) and Anastasia Sergeeva (Vrije Universiteit Amsterdam). The panel will be chaired by Ting Li (Erasmus University Rotterdam).
On Friday, September 23, I am representing Randstad and giving a talk on “The future of work” at the DeLaMar theater, as part of the UvA in DeLaMar seminar series.
🎉 A little success to share: three of our former data science interns at the Data Science chapter at Randstad Groep Nederland have turned their master’s theses into papers accepted at our upcoming RecSys in HR Workshop, an academic workshop that revolves around AI in HR and is part of the ACM International Conference on Recommender Systems (the AI systems used for matching, whether it is Netflix movies to users or, in our case, jobs to job seekers).
As always, the work of the students is pretty technical, but I will go ahead and try to provide little human-understandable summaries below.
Explainable Career Path Predictions using Neural Models
Roan Schellingerhout worked under the supervision of Volodymyr Medentsiy on Explainable Career Path Predictions using Neural Models, where he trained deep neural networks on our own talent work history data to create a tool that can help consultants or talents predict possible career switches, given a talent’s work history as input. The predictions are visually explained, in the sense that the underlying reasons for proposing a certain job are provided. Roan tested these visualizations on consultants, and found that consultants generally like them.
End-to-End Bias Mitigation in Candidate Recommender Systems with Fairness Gates
[bibtex file=citations.bib key=arafan2022end]
Adam Arafan worked under my supervision on “End-to-End Bias Mitigation in Candidate Recommender Systems with Fairness Gates.” In his thesis he experimented with making the SmartMatch Talent Recommender more fair (at the level of gender), either by changing the “input” of the algorithm (for example, by balancing male and female candidates in the training data), or by changing its “output” (for example, for a given list of candidates, going through the list to make sure the top 10 has a 50/50 balance between male and female candidates). His work is novel because these types of “bias mitigation” strategies have been studied in isolation, but never together.
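For the technically curious, the “output” idea can be sketched in a few lines of Python. This is a simple greedy re-ranker written for illustration under my own assumptions (the candidate tuples, the group labels, and the function name `rerank_balanced` are all hypothetical); it is not the fairness gate from Adam’s thesis.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical candidate record: (candidate_id, group, score).
Candidate = Tuple[str, str, float]


def rerank_balanced(
    candidates: Sequence[Candidate],
    k: int = 10,
    groups: Sequence[str] = ("F", "M"),
) -> List[Candidate]:
    """Re-rank so the top k holds an equal share of each group where possible."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    quota = k // len(groups)  # e.g. 5 per group for a top 10 with two groups
    counts: Dict[str, int] = {g: 0 for g in groups}
    top: List[Candidate] = []
    rest: List[Candidate] = []

    for cand in ranked:
        group = cand[1]
        # Candidates from groups outside `groups` never fill a quota slot.
        if len(top) < k and counts.get(group, quota) < quota:
            counts[group] += 1
            top.append(cand)
        else:
            rest.append(cand)

    # If a group has too few candidates (or k is odd), fill the open
    # slots with the best remaining candidates, regardless of group.
    missing = k - len(top)
    top.extend(rest[:missing])
    rest = rest[missing:]

    # Inside the top k, keep the original score order.
    top.sort(key=lambda c: c[2], reverse=True)
    return top + rest


candidates = [
    ("m1", "M", 0.9), ("m2", "M", 0.8), ("m3", "M", 0.7),
    ("f1", "F", 0.6), ("m4", "M", 0.5), ("f2", "F", 0.4),
]
reranked = rerank_balanced(candidates, k=4)
print([c[0] for c in reranked])
```

In this toy example the original top 4 contains three men and one woman; after re-ranking it contains two of each, at the cost of moving a higher-scored candidate down the list. That trade-off between balance and score is exactly why such strategies need to be evaluated on both fairness and accuracy.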
Automated Personnel Scheduling with Reinforcement Learning and Graph Neural Networks
[bibtex file=citations.bib key=platten2022automated]
Ben Platten worked under the supervision of Sepideh Mesbah on Automated Personnel Scheduling with Reinforcement Learning and Graph Neural Networks. He experimented with “reinforcement learning” (a specific machine learning paradigm), which in theory is well suited to the challenging task of scheduling. He tested it on a toy problem and found that, indeed, the method seems to work quite well.
See the full list of accepted papers here: https://recsyshr.aau.dk/accepted-papers/.
And stay tuned for the pre-prints, which I’ll share as soon as they’re available!
Very proud of the latest cohort of Data Science thesis interns at Randstad Groep Nederland. In the absence of a “real” defense at the University of Amsterdam, we organized our own afternoon packed with defenses (and subsequent drinks) at our Randstad HQ in Diemen. At the end of the afternoon we were able to congratulate Roan, Anna, and Adam on a job (almost) well done!
Roan Schellingerhout presented his work on “Explainable Career Path Predictions.” Roan implemented explainable deep neural nets for predicting and explaining a job seeker’s next opportunity, given their previous ones. He evaluated the models intrinsically, in addition to testing them (+ their explanations) with actual recruiters, and found both that the models are accurate and that recruiters like and understand them.
Anna Lőrincz worked on data-to-text generation, and fine-tuned a multilingual transformer model for generating benefits (salary, contract, working hours, locations) in job descriptions in both Dutch and English, given structured information (numeric, categorical, and binary variables). She found that transformers can successfully generate fluent and correct text given structured inputs, confirmed that inputs or prompts have a high impact on performance, and showed that her approach beats template-based methods in textual diversity. She also came across a few very funny hallucinated work locations (“pal achter centraal station in Zwaaijdijk”, roughly “right behind the central station in Zwaaijdijk”, was one of our favorites), and noticed that transformer models sometimes “correct” their output (e.g., turning a 3k/hour salary into a 3k monthly salary).
Finally, Adam Mehdi Arafan presented his “Double Fair-Gated Bias Mitigation Pipeline” for our Talent Recommender system, where he studied bias in multiple parts of our recsys pipeline, from re-balancing training data (to simulate both balanced and highly imbalanced scenarios), to generating additional balanced synthetic data, and re-ranking outputs. It turns out that applying synthetic data not only helps in creating fairer rankers, but can also have benefits in terms of model accuracy!
All three students did a great job. Stay tuned for their theses (and, who knows, publications? 😏)
At a conference in Utrecht I participated in a panel discussion alongside Hans de Zwart (HvA’s Centre of Expertise Applied AI), Rina Joosten-Rabou (Seedlink), Mildo van Staden (Ministerie van Binnenlandse Zaken en Koninkrijksrelaties) and Siri Beerends (SETUP).
I summarized my takeaways on LinkedIn, which I share below (in Dutch).