📅 November 1, 2022 • 🕐 09:08 • 🏷 Papers and Research • 👁 54

Anna Lőrincz‘ UvA MSc. data science thesis “Transfer learning for multilingual vacancy text generation” — which was graded a 9/10 💫 — was recently accepted at the The Second Version of Generation, Evaluation & Metrics (GEM) Workshop 2022 which will be held as part of EMNLP, December 7-11, 2022!

Get the pre-print here:

  • [PDF] A. Lőrincz, D. Graus, D. Lavi, and J. L. M. Pereira, “Transfer learning for multilingual vacancy text generation,” in Proceedings of the 2nd workshop on natural language generation, evaluation, and metrics (gem 2022), 2022.
    author = {L{\H{o}}rincz, Anna and Graus, David and Lavi, Dor and Pereira, Jo{\~a}o L. M.},
    title = {Transfer Learning for Multilingual Vacancy Text Generation},
    year = {2022},
    booktitle = {Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2022)},
    location = {Online},
    publisher = {Association for Computational Linguistics},

In her work, Anna explores transformer models for data-to-text generation, or more specifically: given structured inputs such as categorical features (e.g., location), real valued features (e.g., salary of hours of work per week), or binary features (e.g., contract type) that represent benefits of vacancy texts, the task is to generate a natural language snippet that expresses said feature.

Layout of benefit section from Randstad.nl

Anna finds that using transformers greatly increases (vocabulary) variation when compared to template-based models, and needs less human effort. The results were — to me — surprisingly good, another proof that transformers are taking over the world and making traditional NLP methods partly obsolete.

I was very much impressed with this work! But, to show how even transformers are not perfect, yet, I present you with my favorite error from the paper:

input: LOCATION = Zwaag
output: Pal gelegen achter het centraal station Zwaaijdijk!

Hope to catch you sometime in Zwaaijdijk!