Automation can make public decision-making quicker and cheaper, and in the UK its use is proliferating across government departments. Automated Decision Making (ADM), however, also comes with significant risks and drawbacks, including the well-documented risk that such systems have a disproportionate impact on already marginalised individuals, groups, and communities. Furthermore, there is no systematic, public information about how and why public bodies use ADM systems or how such systems affect the groups to which they are applied.
To have trust in ADM systems, we need transparency. We must be able to see that they are working in a reliable, lawful, and non-discriminatory way. This is not the experience in the UK. Here we are witnessing an approach of secrecy-by-default in the development and deployment of ADM. That’s why the Public Law Project (PLP) has launched the Tracking Automated Government (TAG) Register, an open-source database that will monitor, document and analyse the use of ADM tools by government bodies and highlight where there is a risk that they could lead to biased outcomes.
ADM systems pose a discrimination risk
ADM systems can lead to discrimination in various ways. Three examples from the TAG register underline that discrimination risk can come from: biased feedback loops; unrepresentative training data; and discriminatory design of a system’s rules.
One of the first ADM systems to be deployed by a UK police force in an operational capacity was the Harm Assessment Risk Tool (HART). The tool was developed by statistical experts based at the University of Cambridge in collaboration with Durham Constabulary. HART was used to predict how likely an individual was to commit a violent or non-violent offence over the next two years. Based on the information available to PLP and included in the TAG register, HART was rolled out in 2016 and scrapped in September 2020. HART was constructed using random forests, one of many different forms of machine learning. Random forests allow for the inclusion of a large number of predictors, the use of a variety of different data sources, the expansion of assessments beyond binary outcomes, and the incorporation of the costs of different types of forecasting errors when constructing a new model. HART gave each individual a risk score of low, medium or high, and was deliberately designed to overestimate risk.
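HART's exact model has never been published, but the cost-weighted voting described above can be illustrated with a toy ensemble. In the sketch below, every predictor, threshold and label is invented for illustration and reflects nothing about HART's real inputs; the point is only how votes from many simple randomised classifiers can be aggregated, with "high" votes weighted more heavily so that errors skew toward overestimating risk:

```python
import random

random.seed(42)

RISK_LABELS = ["low", "medium", "high"]

def make_stump(n_features):
    """A randomised decision stump: one feature, one threshold, a label per side."""
    f = random.randrange(n_features)
    t = random.uniform(-1, 1)
    left, right = random.sample(RISK_LABELS, 2)
    return lambda x: left if x[f] <= t else right

def forest_predict(stumps, x, high_weight=2.0):
    """Weighted majority vote: 'high' votes count extra, so the ensemble
    errs toward overestimating risk, mirroring HART's stated design goal."""
    votes = {label: 0.0 for label in RISK_LABELS}
    for stump in stumps:
        label = stump(x)
        votes[label] += high_weight if label == "high" else 1.0
    return max(votes, key=votes.get)

stumps = [make_stump(4) for _ in range(200)]
print(forest_predict(stumps, [0.3, -0.8, 0.1, 0.9]))
```

A real random forest would train each tree on bootstrapped samples of historical case data; this sketch isolates only the aggregation step, where asymmetric error costs can be built in.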
Of the 34 predictor values used in HART, 29 stemmed directly from the individual’s offending history. These were combined with other data including personal characteristics such as age and gender, as well as their postcode. The primary postcode predictor was limited to the first four characters of the postcode, and usually encompassed a large geographic area. Yet, even with this limitation, an academic study noted that this variable risks a kind of feedback loop that may perpetuate or amplify existing patterns of offending. If the police respond to forecasts by targeting their efforts on the highest-risk postcode areas, then more people from these areas will come to police attention and be arrested than those living in lower-risk, untargeted neighbourhoods. These arrests then become outcomes that are used to generate later iterations of the same model, leading to an ever-deepening cycle of increased police attention. This can lead to discrimination, as postcodes can be a proxy for race or ethnicity. For example, in a 2016 study, the Human Rights Data Analysis Group demonstrated that the use of a similar predictive ADM tool, PredPol, in Oakland, California would reinforce racially biased police practices by recommending increased police deployment in neighbourhoods with higher populations of Black and low-income residents.
Another well-documented pattern of ADM discrimination is where a system “learns” from biased data and reproduces that bias. This effect is often referred to by data scientists as “garbage in, garbage out”. The risk of bias from training data was highlighted by the UK courts in the well-known Bridges case,[1] concerning South Wales Police’s (SWP) use of facial recognition technology. The evidence in that case showed that due to imbalances in the representation of different groups in the training data, the technology used could be less accurate at recognising the faces of people of colour and women. The claimant in the case successfully argued that SWP breached the public sector equality duty – a duty under the Equality Act 2010 requiring public bodies to have due regard to the need to eliminate unlawful discrimination – because there was a failure to address indirect discrimination in the assessment of the equality impacts of the tool. The court concluded that SWP had never sought to satisfy themselves that the software programme did not have an unacceptable bias on grounds of race or sex. To put it simply, the underrepresentation in the training data risked the system not recognising Black female faces as accurately as white male faces.
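The kind of disaggregated accuracy check the court found wanting is straightforward to express in code. The sketch below uses illustrative group names and entirely invented numbers; it simply computes recognition accuracy per demographic group from labelled evaluation results, the sort of analysis that would expose a gap between, say, white male and Black female faces:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, correct) pairs -> per-group accuracy."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += bool(correct)
    return {g: hits[g] / totals[g] for g in totals}

# Invented evaluation results, for illustration only: the kind of
# disaggregated audit the court found SWP had never carried out.
records = (
    [("white_male", True)] * 95 + [("white_male", False)] * 5 +
    [("black_female", True)] * 80 + [("black_female", False)] * 20
)
print(accuracy_by_group(records))
```

Nothing about this analysis is technically demanding; what was missing in Bridges was not the method but the decision to run it and act on the result.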
Our final example is where the design of an ADM tool directly discriminates against a group of people. In 2015 the Home Office began using a ‘Visa Streaming Tool’ to automate the risk profiling it undertook of all applications for entry clearance into the UK. The tool allocated every applicant to one of three streams: red, amber or green. It operated in three stages. First was the pre-screening stage, where some nationalities were automatically given a red rating, although that list of nationalities was not publicly disclosed. The tool then matched all other applicants against risk profiles constructed by the Home Office and sorted them into red, amber and green categories based on those profiles. The risk profiles used three factors: nationality, type of application and the location from which the application was made. The ratings determined the level of scrutiny that officials would give to visa entry applications and the time that was to be spent on each application. Following the initiation of a legal challenge by the civil society organisation Joint Council for the Welfare of Immigrants (JCWI) and Foxglove alleging that the visa streaming tool was directly discriminatory on the basis of nationality, the Home Office suspended its use in August 2020.
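Based solely on the public description above – the actual rules and nationality lists were never disclosed – the tool's three-stage logic can be sketched as a simple lookup. Every name and entry below is a hypothetical placeholder, not a reconstruction of the real rules:

```python
# Hypothetical sketch of the three-stage streaming logic described above.
# The real nationality list and risk profiles were never published;
# "CountryX"/"CountryY" are placeholders.
AUTO_RED_NATIONALITIES = {"CountryX"}

RISK_PROFILES = {
    # (nationality, application type, application location) -> rating
    ("CountryY", "work", "onshore"): "amber",
}

def stream(nationality, application_type, location):
    # Stage 1: pre-screening -- some nationalities are automatically red.
    if nationality in AUTO_RED_NATIONALITIES:
        return "red"
    # Stages 2-3: match the applicant against Home Office risk profiles,
    # defaulting to green where no profile applies.
    return RISK_PROFILES.get((nationality, application_type, location), "green")

print(stream("CountryX", "visit", "offshore"))  # red, purely on nationality
```

Even in this skeletal form, the discrimination alleged by JCWI is visible on the face of the code: nationality alone can determine the output before any individual circumstances are considered.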
Lack of transparency only makes things worse
For every discriminatory tool we are aware of, there will be many more that government has managed to keep under wraps. Virtually all of us will have been subject to ADM at some point. But rarely are we aware of specific instances when they happen. This opacity makes it much more difficult for us to identify, challenge, and obtain redress for the negative effects of discriminatory systems. And it is even more egregious when an ADM system is used in a high-impact, life-altering policy area like policing, immigration, or welfare benefits, with potentially very serious emotional, physical and financial implications for those affected.
We strongly suspect that there are many ADM tools whose existence is as-yet unknown to those outside government. This is because we often find out about ADM tools only after they have been in use for some time and only through requests made under the Freedom of Information Act 2000 (FOIA). For example, one of the tools on the TAG Register is the ‘Windrush compensation scheme model’, used to estimate the cost of providing compensation to the Windrush generation in respect of the difficulties they faced in demonstrating their immigration status. PLP discovered this tool only through a request for reviews of models, data analytics and other services conducted by the Analytical Quality Assurance Board: a Home Office body tasked with assessing the proportionality and robustness of their algorithmic tools.
Admittedly, the Cabinet Office has been piloting an ‘Algorithmic Transparency Standard’ (ATS), which asks public sector organisations across the UK to provide information about their algorithmic tools. However, engagement with the ATS is voluntary, and uptake appears to have been very limited. Only six reports have been published to date: four in June and July 2022 and two in October 2022. Nothing since. And yet there are over forty entries on the TAG Register. Moreover, the reports that have been published under the ATS do not concern the kinds of high-stakes and potentially discriminatory tools that PLP knows to exist. We can be sure that, behind closed doors, government is continuing to pilot and deploy new ADM tools, some of which will have the potential to inflict further harm on already-marginalised groups. But the specifics remain secret. And if we do not know that a particular tool exists, we have no hope of assessing whether and how it is discriminatory – let alone challenging its use and obtaining redress.
Moreover, simply knowing that a tool exists is not enough. We also need to know how it works and how it affects different groups. One useful source of information is an equality impact analysis. In the UK, Equality Impact Assessments (EIAs) are often carried out by public bodies to facilitate and evidence compliance with the public sector equality duty. As noted above, the duty requires public bodies to have due regard to the need to eliminate unlawful discrimination. EIAs are sometimes used as an opportunity for public bodies to consider whether any direct or indirect discrimination would be involved in the deployment of an ADM tool. However, in respect of many ADM tools, EIAs are not carried out or, if they are carried out, they are not published. This makes it much more difficult to spot possible discrimination, as we can only rely on anecdotal evidence or on limited statistical evidence collected by civil society post-deployment.
To give an example, the Department for Work and Pensions’ (DWP) 2021-2022 accounts revealed that they had been piloting a new machine learning-driven predictive model to detect possible fraud in Universal Credit claims. The accounts showed that the model was already being used in cases where someone was receiving an advance on their Universal Credit claim and that the DWP was expecting in early 2022-23 to trial the model to stop claims before any payments are made. However, the DWP has not published any EIAs, Data Protection Impact Assessments or other evaluations completed in relation to this new model or its pre-existing ADM tools. This is despite PLP and other civil society organisations having repeatedly made FOIA requests asking for this information.
There is evidence to suggest that some of the DWP’s tools are discriminatory. The Greater Manchester Coalition for Disabled People has collected anecdotal evidence that a high proportion of its members have been flagged for investigation by the DWP’s automated systems.[2] Yet, when asked by Debbie Abrahams MP at a Work and Pensions Committee meeting on 24 November 2021 about the 2021-22 accounts, the DWP was unable to say what proportion of the people being investigated for benefit fraud are disabled. Without any kind of equality impact analysis provided by the DWP, it is extremely difficult to properly understand, prove, or discount any discrimination that may be taking place.
Generally speaking, if an EIA is carried out, this happens early on when a tool is still being developed or piloted. Equally, if not more important, is that ADM tools are continuously monitored and evaluated throughout the period of deployment and that updated reports are routinely published. There are two main reasons for this. First, it may be that once a tool is rolled out and used to process a larger and more diverse group of people, new problems come to light. Second, many ADM systems are not static. Their rules will update and change as they are fed new inputs. This is true of machine learning systems, such as the ones described in this article.
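That second point can be illustrated with a toy example, assuming a hypothetical fraud-flagging threshold that is nudged by each new case outcome. Even this trivial update rule means the system a public body assessed at launch is not the system running a year later:

```python
# Toy "online" update rule: the flagging threshold drifts as new cases arrive.
# The threshold, learning rate and case outcomes are all invented for illustration.
threshold = 0.5       # the value a launch-time equality assessment would have seen
LEARNING_RATE = 0.05

def update(threshold, was_fraud):
    """Lower the threshold after a confirmed fraud, raise it otherwise."""
    return threshold - LEARNING_RATE if was_fraud else threshold + LEARNING_RATE

for was_fraud in [True, True, False]:
    threshold = update(threshold, was_fraud)

print(round(threshold, 2))  # the decision boundary has drifted from its audited value
```

A one-off assessment captures only the starting point of such a system; without repeated evaluation, any equality analysis goes stale as the decision boundary moves.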
If automation is going to deliver for us as a society, we need to ensure it is not simply making prejudicial or discriminatory decisions at a faster rate than humans can. The first step towards that goal is transparency. If we cannot see a system, we will not even know if it causes harm, let alone be able to challenge it.
By tracking and analysing examples of ADM in government, the TAG register is an important part of making sure ADM use is legal, fair, and non-discriminatory. If ADM does not operate in this way, then it is not fit for purpose. By facilitating a clearer understanding of the risks associated with the deployment of ADM, the TAG register will help to ensure that affected individuals, and the organisations representing their interests, are better placed to challenge these decisions. We also hope that the TAG Register will set an example, encouraging government to think harder about what meaningful transparency looks like, and how best to achieve it.
The TAG register is there for all to use and contribute to. If you know of any ADM tools that do not currently feature on the TAG register, or if you have any further information about tools already on the register, please get in touch. Help us develop a picture of how these tools operate, so we can spot discrimination and challenge it before it causes more harm.
- [1] R (Bridges) v Chief Constable of South Wales Police and others [2020] EWCA Civ 1058.
- [2] See Michael Savage, ‘DWP urged to reveal algorithm that “targets” disabled for benefit fraud’ (The Guardian, 21 November 2021) and ‘GMCDP & Foxglove Legal Challenge to the Department for Work and Pensions DWP Fraud Algorithm’, GMCDP news page.