Internships and MSc. projects at Randstad Groep Nederland

About Randstad

Work with impact. At Randstad Groep Nederland IT you keep the country moving, enabling people across sectors to do their work, getting pizza on your table and your suitcase on the plane. Your AI solutions make tomorrow’s recruiter smarter and faster while still embodying our human forward approach: combining tech with a personal touch and putting people first, including you. We are constantly experimenting, working on new NLP use cases and matching systems, and expanding our self-service data platform. If you bring the idea, we will provide the freedom to explore, so you can help us shape the world of work.

Data Science @ RGN

Randstad IT is organized in a variation of the Spotify Engineering Model with squads, tribes, and chapters. Our data science chapter spans 12 data scientists, data engineers and machine learning engineers over 3 departments (IT, finance, and marketing), across 6 different teams. These teams work on recommender systems for algorithmic job matching, natural language processing and information extraction, forecasting, and more. We are further interested in AI fairness and auditing, explainability, and transparency.

Who are you?

We’re looking for students studying AI, data science, or related programs, for either graduation projects or regular internships. Fluency in Python is required, and we expect our interns to work autonomously. That said, as an intern you’ll be a fully fledged member of our chapter, which means you benefit from the knowledge that is shared within it.

Here’s the overview of our suggested projects:

(Deep) Reinforcement Learning-based Planning & Poolmanagement
Writing style transfer learning
Career pathing MVP
Pairwise learning to rank for SmartMatch
Revenue forecasting using time-series algorithms
Structured information extraction from resumes
Salary parsing from vacancies
Record linkage for company linking
Free text notes and comments for improved job matching

This is a list of tentative projects, by no means exhaustive. If there’s anything that you find interesting or you have a specific idea, do not hesitate to reach out to me at david.graus@randstadgroep.nl!

Projects

(Deep) Reinforcement Learning-based Planning & Poolmanagement

Forget ancient board games, and solve real-world problems for real people: AlphaGo, but for automated planning! At Randstad we schedule over 63,000 employees into over 170,000 shifts every week. Our finalized schedules are the end-state of a complex planning game, in which we map our pool of employees onto shifts, subject to multiple complex constraints and rules at the employee, shift, and legal levels.

For this project, we would like to explore the potential of reinforcement learning methods for learning how to plan, leveraging our rich set of historic plannings and the attributes and data variables associated with our employees and shifts. We’re looking for students who are interested in (deep) reinforcement learning and want to explore how approaches that have proved successful in chess, Go, or other games can be applied to our planning challenge.

Writing style transfer learning

In an effort to improve the textual quality of vacancies written by our consultants, we’re looking for ways in which ML can provide guidance on, or checks of, the textual content. One potential avenue is looking into text generation algorithms like GPT-2 to generate vacancy texts, where the final output layer is specifically trained for the brand (Tempo-Team, Randstad, Yacht, BMC), resulting in a vacancy text with the same purpose and information, but written in the style of the brand.

We’re looking for students with an interest in natural language processing, deep learning, and specifically transformer models.

Career pathing MVP

Part of Randstad’s commitment is to help our talents grow. We have rich longitudinal data in our databases, i.e., candidates who have been with us for many months or years, growing over time from one role to another. By aggregating all these career paths, interesting patterns may emerge.

For this project we are looking to put our longitudinal career-path data to use. Projects can include predicting the most common next role given a current one, or building tools or MVPs that help job seekers plan their career in steps, e.g., by including transition probabilities between roles or shortest-path finding.
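To make the idea of transition probabilities and path-finding concrete, here is a minimal sketch in Python. The role names and career paths are invented toy data, and the function names are hypothetical; it only illustrates one possible starting point, not an existing Randstad system. It estimates first-order transition probabilities from observed paths and then finds the most likely route between two roles by running Dijkstra's algorithm on edge costs of -log(p).

```python
import heapq
import math
from collections import Counter, defaultdict

# Invented toy career paths; real input would come from longitudinal records.
paths = [
    ["order picker", "forklift driver", "team lead"],
    ["order picker", "forklift driver", "planner"],
    ["order picker", "team lead"],
    ["forklift driver", "team lead", "warehouse manager"],
]


def transition_probabilities(paths):
    """Estimate P(next role | current role) from consecutive role pairs."""
    counts = defaultdict(Counter)
    for path in paths:
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return {
        role: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for role, c in counts.items()
    }


def most_likely_path(probs, start, goal):
    """Dijkstra with edge cost -log(p): the cheapest path has the highest
    product of transition probabilities."""
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        cost, role, path = heapq.heappop(queue)
        if role == goal:
            return path, math.exp(-cost)
        if role in settled and settled[role] <= cost:
            continue
        settled[role] = cost
        for nxt, p in probs.get(role, {}).items():
            heapq.heappush(queue, (cost - math.log(p), nxt, path + [nxt]))
    return None, 0.0


probs = transition_probabilities(paths)
path, probability = most_likely_path(probs, "order picker", "warehouse manager")
print(path, round(probability, 3))
```

In this toy data the most likely route to "warehouse manager" goes through "forklift driver" even though a shorter route exists, which shows why "shortest" and "most probable" paths are worth distinguishing.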

Pairwise learning to rank for SmartMatch

At Randstad we have a rich set of feedback on candidates’ application journeys, ranging from recruiters’ clicks on profiles, to candidate selections, candidates who applied to vacancies themselves, and candidates who are interviewed or assessed by clients, all the way to the actual placements and rejections. These rich feedback signals motivate us to study more closely how to optimize our matching algorithms, e.g., by considering the sequential or ordinal nature of the different labels, modeling relative preferences, or distinguishing short-term from long-term effects.

For this project we’re looking for a student with an interest in learning-to-rank models and recommender systems, learning from implicit feedback, and, more generally, information retrieval.
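As an illustration of what pairwise learning to rank from ordinal feedback can look like, here is a small, dependency-free Python sketch. The feedback levels, their ordering, and the two-dimensional feature vectors are assumptions made up for this example; it is not a description of how SmartMatch works. It turns ordinal labels into preference pairs and fits a linear scorer with a pairwise logistic loss (in the spirit of RankNet).

```python
import math
import random

# Assumed ordinal ranking of feedback signals (illustrative only).
LABELS = {"no_action": 0, "click": 1, "interview": 2, "placement": 3}

# Toy candidates for a single vacancy: (feature vector, observed feedback).
candidates = [
    ([0.9, 0.8], "placement"),
    ([0.7, 0.6], "interview"),
    ([0.4, 0.5], "click"),
    ([0.1, 0.2], "no_action"),
]


def score(w, x):
    """Linear scoring function."""
    return sum(wi * xi for wi, xi in zip(w, x))


def preference_pairs(items):
    """All (preferred, other) feature pairs implied by the ordinal labels."""
    return [
        (xa, xb)
        for xa, la in items
        for xb, lb in items
        if LABELS[la] > LABELS[lb]
    ]


def train(pairs, dim, epochs=200, lr=0.5, seed=0):
    """SGD on the pairwise logistic loss log(1 + exp(-(s_a - s_b)))."""
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for xa, xb in pairs:
            diff = [a - b for a, b in zip(xa, xb)]
            margin = score(w, diff)
            # Negative gradient of the loss w.r.t. w is sigmoid(-margin) * diff.
            step = 1.0 / (1.0 + math.exp(margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w


pairs = preference_pairs(candidates)
w = train(pairs, dim=2)
ranked = sorted(candidates, key=lambda c: score(w, c[0]), reverse=True)
print([label for _, label in ranked])
```

Training on pairs rather than on absolute labels means the model only has to get relative order right, which is one way to use signals whose absolute values are hard to calibrate.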

Revenue forecasting using time-series algorithms

We currently have forecasting models in production that leverage SARIMAX, but with the rise of, e.g., deep learning for time-series forecasting and the release of libraries and packages such as tslearn, we are curious whether alternative algorithms can outperform our current SARIMAX models. We are looking for students with experience in Python and an interest in time-series modeling and machine learning.
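Whether an alternative "performs better" depends on the evaluation protocol. One common choice is a rolling-origin backtest, sketched below in plain Python. The series is synthetic, and the two forecasters are deliberately trivial stand-ins (hypothetical names, not our production models); a SARIMAX or deep learning model could be plugged in through the same function signature.

```python
def seasonal_naive(history, horizon, season=12):
    """Forecast each step with the value observed one season earlier."""
    return [history[-season + (h % season)] for h in range(horizon)]


def last_value(history, horizon):
    """Forecast every step with the most recent observation."""
    return [history[-1]] * horizon


def rolling_origin_mae(series, forecaster, min_train, horizon):
    """Mean absolute error over all forecast origins from min_train onward.

    At each origin the forecaster only sees data before that origin,
    which avoids leaking future values into the evaluation.
    """
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        forecast = forecaster(series[:origin], horizon)
        actual = series[origin:origin + horizon]
        errors.extend(abs(f - a) for f, a in zip(forecast, actual))
    return sum(errors) / len(errors)


# Synthetic monthly series: a 12-month seasonal pattern plus a linear trend.
series = [100 + 10 * (t % 12) + t for t in range(48)]

mae_seasonal = rolling_origin_mae(series, seasonal_naive, min_train=24, horizon=3)
mae_last = rolling_origin_mae(series, last_value, min_train=24, horizon=3)
print(mae_seasonal, mae_last)
```

Comparing every candidate against simple baselines like these under the same protocol makes it easier to tell whether a more complex model adds value.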

Structured information extraction from resumes

Some of our matching algorithms leverage unstructured data. To reliably extract machine-readable information from resumes, we aim to study methods for segmenting resumes. For this project, we’re looking for students interested in recent advances in deep learning (see, e.g., Google’s “Extracting Structured Data from Templatic Documents”) and computer vision, to develop clever algorithms for resume segmentation.

Salary parsing from vacancies

At Randstad we have several datasets consisting of vacancy texts, currently mostly stored in unstructured full-text format. In our mission to extract more meaningful and structured information from these vacancies, e.g., skills and job titles, we would also like to reliably extract salaries and salary ranges from vacancy texts.

We’re looking for students interested in natural language processing, and information and entity extraction to train custom salary extraction methods on hundreds of thousands of vacancy texts.
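To give a feel for the task, here is a rule-based baseline sketched in Python. The example sentence, the regular expression, and the assumption of Dutch number notation ("." for thousands, "," for decimals) are ours for illustration only; real vacancy texts are far messier, which is exactly why trained extraction methods are of interest.

```python
import re

# Amount in Dutch notation, e.g. "€ 2.500", "€14,50" or "€ 2.500,-".
AMOUNT = r"€\s?(\d{1,3}(?:\.\d{3})+|\d+)(?:,(\d{2}|-))?"
PERIOD = r"per\s+(uur|week|maand|jaar)"
PATTERN = re.compile(
    # Optional upper bound, then up to 30 filler characters before the period.
    rf"{AMOUNT}(?:\s*(?:-|–|tot)\s*{AMOUNT})?[^.€\d]{{0,30}}?{PERIOD}",
    re.IGNORECASE,
)


def to_number(whole, cents):
    """Convert captured Dutch-formatted parts into a float."""
    value = float(whole.replace(".", ""))
    if cents and cents != "-":
        value += int(cents) / 100
    return value


def extract_salaries(text):
    """Return a list of {min, max, period} dicts found in the text."""
    results = []
    for m in PATTERN.finditer(text):
        low = to_number(m.group(1), m.group(2))
        high = to_number(m.group(3), m.group(4)) if m.group(3) else low
        results.append({"min": low, "max": high, "period": m.group(5).lower()})
    return results


text = (
    "Salaris: € 2.500 - € 3.200 bruto per maand. "
    "Oproepkrachten verdienen €14,50 per uur."
)
salaries = extract_salaries(text)
print(salaries)
```

A baseline like this can also serve as a weak labeler to bootstrap training data for a learned entity extraction model.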

Record linkage for company linking

For different products at Randstad we need to link company data (e.g., from our internal database) to a canonical company database. This task of entity normalization or record linkage has been studied extensively, and poses many challenges, ranging from scale (typically dealing with a combinatorial explosion when having to perform pairwise comparisons between millions of entries) to finding innovative and smart ways to compute similarity between different types of fields (categorical, real-valued, full-text).

We’re looking for students who would like to sink their teeth into this problem that combines big data with clever feature engineering.
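The scale problem mentioned above is commonly tackled with blocking: only records that share a cheap key are compared in detail. The Python sketch below illustrates this with invented company records, a hypothetical blocking key (postcode digits plus first letter of the normalized name), and string similarity from the standard library's `difflib`. The field names and the 0.85 threshold are assumptions, not our actual schema or settings.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Legal-form tokens dropped before comparing names (illustrative list).
LEGAL_FORMS = {"bv", "nv", "vof", "holding"}


def normalize(name):
    """Lowercase, strip punctuation, and drop legal-form tokens."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower().replace(".", ""))
    return " ".join(t for t in cleaned.split() if t not in LEGAL_FORMS)


def block_key(record):
    """Cheap key: four postcode digits plus first letter of the name."""
    return (record["postcode"][:4], normalize(record["name"])[:1])


def link(internal, canonical, threshold=0.85):
    """Map each internal id to the best canonical id within its block."""
    blocks = defaultdict(list)
    for rec in canonical:
        blocks[block_key(rec)].append(rec)
    links = {}
    for rec in internal:
        best_id, best_score = None, threshold
        for cand in blocks.get(block_key(rec), []):
            similarity = SequenceMatcher(
                None, normalize(rec["name"]), normalize(cand["name"])
            ).ratio()
            if similarity >= best_score:
                best_id, best_score = cand["id"], similarity
        links[rec["id"]] = best_id
    return links


internal = [
    {"id": "i1", "name": "Jansen Bouw B.V.", "postcode": "1012AB"},
    {"id": "i2", "name": "Bakkerij de Vries", "postcode": "3511CD"},
]
canonical = [
    {"id": "c1", "name": "Janssen Bouw BV", "postcode": "1012 AB"},
    {"id": "c2", "name": "Jansen Bouwmaterialen", "postcode": "9711EF"},
    {"id": "c3", "name": "Slagerij de Vries", "postcode": "3511CD"},
]
links = link(internal, canonical)
print(links)
```

Blocking trades recall for speed: a too-strict key can hide true matches, so the choice of key and of per-field similarity functions is where much of the feature engineering happens.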

Free text notes and comments for improved job matching

As part of Randstad’s matching activities, our recruiters speak to many candidates and clients. In their workflow, they may log notes and comments about vacancies and candidates as additional metadata, which is currently unused. This data is challenging: it is freeform text jotted down by recruiters during their work, so structure, spelling, and format differ across authors and individual notes.

We’re looking for students interested in natural language processing to unleash value from these freeform notes, e.g., through extracting structured information (entities, locations, company names), devising novel feature representations, or applying classification or topic modeling methods to improve searching or algorithmic matching.