Improving automated segmentation of radio shows with audio embeddings published @ IEEE ICASSP2020

Oberon Berlage’s MSc. thesis, “Improving automated segmentation of radio shows with audio embeddings,” which he wrote under my supervision during his internship at FD Mediagroep, was awarded a 9/10, on the condition that the work was publishable.

Turns out it was: it was recently accepted at IEEE ICASSP 2020 (the 45th International Conference on Acoustics, Speech, and Signal Processing) without any additional work or experiments (just a bit of trimming). But you already knew this… Oberon will be presenting the work in Barcelona, thanks to the generous support of UvA’s Information Studies program.

We have now published a preprint; read it below:

  • [PDF] [DOI] O. Berlage, K. Lux, and D. Graus, “Improving automated segmentation of radio shows with audio embeddings,” in ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 751–755.

His work revolved around improving BNR SMART Radio’s text-based segmentation by incorporating audio signals in the form of audio embeddings. This turns out to improve over our text-based baseline by a whopping +32.3% in F1-measure!

Even better: an audio-only approach, trained on a smallish openly available dataset, outperforms our text-only baseline by 9.4%. This means the segmentation method can be employed without the need for audio transcription, which could be a money-saver.
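To give a flavor of how audio embeddings can drive segmentation, here is a minimal, purely illustrative sketch (not the method from the paper): each audio window is represented by an embedding vector, and a boundary is placed wherever the cosine similarity between consecutive windows drops below a threshold. The embedding values and the threshold are made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def find_boundaries(embeddings, threshold=0.5):
    """Mark a segment boundary before window i+1 when the similarity
    between windows i and i+1 falls below the threshold."""
    return [i + 1 for i, (u, v) in enumerate(zip(embeddings, embeddings[1:]))
            if cosine(u, v) < threshold]

# Toy example: two acoustically similar windows, then an abrupt change.
windows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(find_boundaries(windows))  # -> [2]: a boundary before the third window
```

In practice the paper's approach is supervised and combines audio with text, but this unsupervised similarity-drop idea is the simplest way to see why audio embeddings alone already carry segmentation signal.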
