Below is an article I wrote with Maarten de Rijke, which was published in nrc.next and NRC Handelsblad under a somewhat misleading title (which wasn’t ours). I cleaned up a Google Translate translation of this article. The translation is far from perfect, but I believe it gets the main point across. You can read the original article in Blendle (for €0.29) or on NRC.nl (for free).
A Google image search for “three black teens” resulted in mugshot photos, while a search for “three white teens” yielded stock photos of happy, smiling youth. Commotion everywhere, and not for the first time. The alleged lack of neutrality of algorithms is a controversial topic. In this controversy, the voice of computer scientists is hardly ever heard. Yet to have a meaningful discussion on the topic, it is important to understand the underlying technologies.
Our contention, as computer scientists: the lack of neutrality is both necessary and desirable. It is what enables search and recommendation systems to provide us access to huge amounts of information, and let us discover new music or movies. With objective, neutral algorithms, we wouldn’t be able to find anything anymore.
There are two reasons for this. First, the “usefulness” of information is personal and context-dependent. The quality of a movie recommendation from Netflix, the interestingness of a Facebook post, even the usefulness of a Google search result, varies per person and context. Without contextual information, such as user location, time, or the task performed by the user, even experts do not reach agreement on the usefulness of a search result.
Second, search and recommendation systems have to give us access to enormous quantities of information. Deciding what (not) to display, the filtering of information, is a necessity. The alternative would be a “Facebook” that shows thousands of new messages every single day, so that each visit brings a completely new deluge of posts. Or a Netflix that recommends only random movies, so that you can no longer find the movies you really care about.
In short, search and recommendation systems have to be subjective, context-dependent, and adapted to ourselves. They learn this subjectivity and lack of neutrality from us, their users. The results of these systems are thus a reflection of ourselves, our preferences, attitudes, opinions and behavior. Never an absolute truth.
The idea of an algorithm as a static set of instructions carried out by a machine is misleading. In the context of, for example, Facebook’s news feed, Google’s search results or Netflix’s recommendations, a machine is not told what to do, but told to learn what to do. The systems learn from subjective sources: ourselves, our preferences, our interaction behavior. Learning from subjective sources naturally yields subjective outcomes.
To choose what results to show, a search and recommendation system learns to predict the user’s preferences or taste. To do this, it does what computers do best: count things. By keeping track of the likes a post receives, or the post’s reading time, the system is able to measure various characteristics of a post. Likes and reading time are just two examples: in reality, hundreds of attributes are included.
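To make the “counting things” step concrete, here is a minimal sketch in Python. It is an illustration only, not how Facebook, Google or Netflix actually work: the event log, the post names and the two attributes (likes and average reading time) are all invented for this example.

```python
from collections import defaultdict

# Hypothetical interaction log: (post_id, event_type, value).
# All names and numbers are invented for illustration.
events = [
    ("post_a", "like", 1),
    ("post_a", "read_seconds", 40),
    ("post_b", "like", 1),
    ("post_a", "like", 1),
    ("post_b", "read_seconds", 5),
    ("post_a", "read_seconds", 20),
]


def count_features(events):
    """Aggregate raw events into simple per-post characteristics."""
    totals = defaultdict(lambda: {"likes": 0, "reads": 0, "read_seconds": 0})
    for post_id, event_type, value in events:
        if event_type == "like":
            totals[post_id]["likes"] += value
        elif event_type == "read_seconds":
            totals[post_id]["reads"] += 1
            totals[post_id]["read_seconds"] += value

    features = {}
    for post_id, t in totals.items():
        # Average reading time per read; 0.0 if the post was never read.
        avg = t["read_seconds"] / t["reads"] if t["reads"] else 0.0
        features[post_id] = {"likes": t["likes"], "avg_read_seconds": avg}
    return features


features = count_features(events)
print(features)
```

A real system would track hundreds of such attributes, but the principle is the same: simple counts turned into characteristics of a post.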
To then learn what is useful for an individual user, the system must determine which features of posts the user considers important. To do so, it needs a way to measure how effective the displayed information is. For this, the system gets a goal, such as making sure the user spends more time on the site.
By showing messages with different characteristics (more or fewer likes, longer or shorter reading times), and keeping track of how long or how often the user visits the site, the system can learn which message characteristics make people spend more time on the website. Things that are simple to measure (clicks, likes, or reading time) are used to bring about more profound changes in user behavior (long-term engagement). Furthermore, research has shown that following personalized recommendations eventually leads to a wider range of choices, and to a higher appreciation of the consumed content by users.
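The learning step described above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, assuming a plain linear model fitted by gradient descent; real systems use far richer models and data. The observations, the two features, the function names `learn_weights` and `rank`, and the candidate posts are all hypothetical.

```python
# Hypothetical observations for one user: the features of the posts shown
# (likes in hundreds, average reading time in minutes) and the minutes the
# user then spent on the site. All numbers are invented for illustration.
observations = [
    ((1.0, 0.5), 2.0),
    ((3.0, 0.5), 2.2),
    ((1.0, 3.0), 7.0),
    ((3.0, 3.0), 7.1),
    ((2.0, 1.5), 4.0),
]


def predict(x, w, bias):
    """Predicted time on site for a post with features x."""
    return w[0] * x[0] + w[1] * x[1] + bias


def learn_weights(observations, steps=20000, rate=0.01):
    """Fit a linear model of time spent by batch gradient descent."""
    w = [0.0, 0.0]
    bias = 0.0
    n = len(observations)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in observations:
            error = predict(x, w, bias) - y
            grad_w[0] += error * x[0] / n
            grad_w[1] += error * x[1] / n
            grad_b += error / n
        w[0] -= rate * grad_w[0]
        w[1] -= rate * grad_w[1]
        bias -= rate * grad_b
    return w, bias


def rank(posts, w, bias):
    """Order candidate posts by predicted time on site, highest first."""
    return sorted(posts, key=lambda p: predict(p[1], w, bias), reverse=True)


weights, bias = learn_weights(observations)

# Hypothetical candidates: (name, (likes in hundreds, reading minutes)).
candidates = [("viral_clip", (3.0, 0.3)), ("long_read", (0.5, 3.0))]
ranked = rank(candidates, weights, bias)
print(weights, [name for name, _ in ranked])
```

With these invented numbers, the learned weight for reading time comes out much larger than the weight for likes, so this user’s feed would favor the long read over the viral clip. A different user, with different behavior, would yield different weights and a different ranking: the subjectivity is learned from the user.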
The success of modern search and recommendation systems largely results from their lack of neutrality. We should consider these systems as “personalized information intermediaries.” Just like traditional intermediaries (journalists, doctors, opinion leaders), they provide a point of view by filtering and ranking information. And just as with traditional intermediaries, it would be wise to seek a second or third opinion when it really matters.