Algorithms replicate human behaviour and understanding
Algorithms have flooded the Internet, incorporating predictions and making decisions in place of humans in most aspects of our daily life. An algorithm can review resumes to mechanise recruiting and assign value to candidates based on their qualities. It can also individualise and target online advertising at segments of people with the same interests. What these systems have in common is that they produce results on the basis of a dataset they have processed and incorporated into decision-making, whether to inform a later human decision or to automate a task by mimicking human behaviour and thinking.
Some algorithms, notably those behind Facial Recognition Technology (FRT), use physical appearance as a proxy to decide on the future performance of potential candidates at the interview stage, to detect sexuality or to predict criminality. Harmful impacts stemming from the implementation of Artificial Intelligence (AI) in some of these fields have manifested as discrimination against particular communities, e.g., women, Black people and/or LGBTQ+ people. These ripple effects into broader society have consistently been framed in the literature as purely computational errors (statistical or computational biases), which can be addressed as a ‘bug’ within the larger system. In other words, such mistakes are assumed to be detectable locally, in a single place. However, the human and systemic biases incorporated into the technology are constantly overlooked because they cannot be solved as easily, even though they are the main source of the unfairness produced by algorithms.
Following this line of thought, algorithms replace humans and human processes. In doing so, they reproduce the reactions humans would have when faced with the same set of data, only at far greater speed. Hence, the problem with bias in AI does not stem from the technology deviating from the human behaviour it is expected to perform. On the contrary, it replicates a fundamental part of the human mind and nature: decision-making.
Good intentions pave the road to hell
Returning to FRT, research has tried to emulate the human brain (and even to go beyond the conscious mind) to prove sociological and biological assumptions that were plagued with misconceptions. One of the most daring applications of FRT and algorithms was first published as a study in 2017 by two researchers from Stanford University, who set out to demonstrate that sexual orientation could be inferred from a person’s facial characteristics. To do so, they used a sample of photos that users had already uploaded to an online dating website to distinguish between gay and straight men and women. As a result, the FRT model was reported to identify sexual orientation with 81% accuracy for men and 74% accuracy for women.
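As a brief aside on what such figures mean mechanically: a headline accuracy per group is simply the share of correct predictions within each group’s subset of the data. A minimal sketch in Python, with purely illustrative labels (none of this data comes from the study):

```python
def group_accuracy(preds, labels, groups):
    """Compute classification accuracy separately for each group.

    A figure like '81% for men, 74% for women' is just the share of
    correct predictions within each group's subset of the test data.
    """
    totals, correct = {}, {}
    for pred, truth, group in zip(preds, labels, groups):
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (pred == truth)
    return {g: correct[g] / totals[g] for g in totals}

# Illustrative toy data only: two "men" both classified correctly,
# two "women" with one correct prediction.
scores = group_accuracy(
    preds=["a", "b", "a", "a"],
    labels=["a", "b", "b", "a"],
    groups=["m", "m", "f", "f"],
)
# → {"m": 1.0, "f": 0.5}
```

Note that such aggregate figures say nothing about how the errors are distributed, which is precisely where the harms discussed below arise.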
In the words of one of the researchers participating in the report, the results provided evidence that sexual orientation has biological roots and is not a lifestyle choice for LGBTQ+ communities. However, the findings met a huge backlash from civil rights activists and organisations, who mainly contested the ethics of the use of FRT. They were right to do so.
First, massive amounts of photos were analysed and processed through FRT to enable the algorithm to make a binary decision about an individual, i.e., gay or not gay. Although the study did not disclose the dating website from which the images were taken, it is clear that the 300,000 photographs analysed were, in fact, scraped from public profiles posted on a U.S. dating website. Thus, no consent was given by the users who had uploaded those photos, neither in the privacy disclosure they had to agree to when they first signed up to the website, nor afterwards. Second, sexual orientation and sexuality are certainly not measurable attributes that can be determined with ease. In fact, the study excluded bisexual and transgender people from its scope and did not account for sexual fluidity in the least. The decisions produced by the algorithm set out a clear answer: you are gay, or you are straight, with nothing in between. Last of all, even though the study did not release the AI that produced the decisions, such a system poses a great risk of misuse in the wrong hands. Even if the results of the study prove to be technically correct, when used by governments and authorities that prosecute LGBTQ+ people as criminals, this technology can unleash all kinds of threats to sexual rights.
Specific solutions to a rare problem: what is the norm?
So, the question is: what is the solution? To the first problem, consent should have been gathered from the users of the dating website. This would, at least, address the problem of data misuse and decontextualisation, since the dataset was not dedicated to the purpose it was first designed to serve within the dating website. In short, however, the same images would have been used, with no real impact on the possible bias and misappropriation raised by the study. The bias would still stand insofar as the social and cultural context of the images would not be taken into account.
To the second problem, the dataset used was not representative at all, insofar as it exhibited entrenched historical and systemic biases: bisexual and transgender people were not even considered in the AI’s categorisation, although they could well have been present in the sample of images used for the research. FRT mapped static categories (gay vs. not gay) onto diverse facial characteristics and people, with great risks to the dataset’s own accuracy when measuring its success. This problem, named in this post the debiasing paradox, is quite difficult to address without a profound knowledge of privacy and transparency, insofar as the simple answer would be to add these categories to the results produced by the AI. In turn, this would further accentuate the risk of misidentification unless it came with a bigger and more representative dataset in which all sexual orientations and sexualities are contemplated. In other words, more data and images would be needed to increase the sample and capture different sensibilities. It is still unclear whether conducting invasive data collection on communities that are already marginalised (and, in most cases, rarely acknowledged by society) would ensure that they are represented with accuracy. Once again, a technical solution (and an aggressive data collection practice at that) would not put an end to the bias against sexual fluidity, nor to the risks of projecting sexuality as binary.
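The representativeness gap described above can at least be surfaced mechanically: before training, one can audit how the dataset’s labels cover the categories the system is meant to decide between. A minimal sketch in Python, with illustrative label names (not drawn from the study):

```python
from collections import Counter

def audit_labels(labels, expected_categories):
    """Report how dataset labels cover an expected set of categories.

    Returns (distribution, missing): the share of each observed label,
    and any expected categories entirely absent from the data, e.g. a
    'gay vs. straight' dataset that never includes bisexual people.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    distribution = {label: n / total for label, n in counts.items()}
    missing = [c for c in expected_categories if c not in counts]
    return distribution, missing

# Illustrative example: two categories are missing from the data entirely.
dist, missing = audit_labels(
    labels=["gay", "straight", "straight", "straight"],
    expected_categories=["gay", "straight", "bisexual", "transgender"],
)
# → dist == {"gay": 0.25, "straight": 0.75}
# → missing == ["bisexual", "transgender"]
```

Such an audit only reveals the gap; as argued above, closing it by aggressively collecting more data on marginalised groups raises its own risks.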
To the third problem, risk avoidance has normally been the solution: for instance, do not pass on (or sell) the information and processes the technology used to determine whether a person is gay to governments and authorities that would heavily interfere with fundamental rights. However, with regard to this last problem, and to the application of the technology in general, risk mitigation is seldom considered, and it should be, especially in the form of risk assessments.
In this sense, we must ask ourselves whether certain systems should be designed at all, and whether demonstrating the effective and accurate functioning of algorithms and FRT beyond human perception and nature is necessary when it in fact gives grounds for speculation and further reason for prejudice and intolerance towards particularly vulnerable communities.
 Based on the research findings in ‘Discriminating Systems: Gender, Race and Power in AI’ and ‘Towards a Standard for Identifying and Managing Bias in Artificial Intelligence’.
 Data scraping refers to the process of extracting data from a digital source for automated replication, formatting or manipulation by a computer program.