The use of machine learning algorithms to predict and detect benefit fraud is becoming increasingly common. Lighthouse Reports recently shone a light on the city of Rotterdam’s use, from 2017 to 2021, of an algorithm which gave each of its citizens a fraud risk score. Those who received one of the approximately one thousand highest risk scores each year were selected for investigation. Rotterdam’s tool had similarities with the Dutch Government’s use of a system called Systeem Risico Indicatie (SyRI), which linked data from a vast set of government departments to generate fraud risk reports about individuals, reports which would then trigger fraud investigations. The Hague District Court found the system to be in violation of Article 8 of the ECHR due to its lack of transparency and its intrusion into privacy through the linking of personal datasets across government.

In the UK, the Department for Work and Pensions is using at least five machine learning models to identify potential fraud by applicants for Universal Credit, the UK’s means-tested benefit for those out of work or on low incomes. The Department has suspended payments to individuals while they await investigation. It has told the National Audit Office that its ability to test whether the machine learning systems discriminate on the basis of protected characteristics is currently limited.
One of the current problems with automated fraud detection systems is their lack of transparency. Firstly, individuals often do not know that an automated tool was used in making the decision to investigate them. Secondly, they are not typically informed of the decisional criteria, input variables or risk indicators used by the system to select them. Without this transparency there is no way to assess whether the decisional criteria used are accurate, relevant, discriminatory or causally reliable.
There are various reasons why governments do not wish to provide information about the algorithmic tools they are using, including on commercial confidentiality grounds and due to privacy concerns. One of the predominant reasons for withholding information about these tools is that disclosure would allow bad actors to ‘game’ the system by manipulating their answers or profile to illegitimately receive a favourable outcome.
This blog post argues that a blanket refusal to disclose any of the risk model indicators or decisional criteria used by these systems does not adequately balance the risks of procedural unfairness for citizens against the risks of gaming and manipulation of these fraud detection systems.
Requirements for transparency under Art 41 CFR
Transparency deficits are concerning from a procedural fairness perspective. Appelman makes the point that a lack of transparency makes a broader assessment of the fairness of automated systems and whether or not they are discriminatory virtually impossible. In the SyRI case the Dutch Government refused to disclose the risk model and indicators, the threshold values for a fraud investigation or the types of data used. Citizens were never notified that a risk profile had been generated about them.
The Hague District Court used this lack of transparency as the foundation for its conclusion that the system violated Article 8 of the ECHR. Appelman notes:
The Court made clear that, at a minimum, insight must be given into “the risk indicators and the risk model, or at least … further legal safeguards to compensate for this lack of insight”. Additionally, insight needed to be given into “which objective factual data can justifiably lead to the conclusion that there is an increased risk”. …This opacity and lack of information also greatly inhibits the ability of the people involved to exercise their rights or defend themselves, especially since they are at no point informed of their (passive) involvement.
A lack of transparency around automated decision-making tools does not only risk violating Article 8 of the ECHR. Article 41 of the EU Charter of Fundamental Rights holds that every person has the right to have their affairs handled ‘impartially, fairly and within a reasonable time by the institutions, bodies, offices and agencies of the Union’. This includes the right of every person to be heard, the right to have access to their file and an obligation on the administration to give reasons for its decisions.
The Court of Justice of the European Union has interpreted this as requiring a duty of care in administrative decision-making. This duty of care requires administrators ‘to conduct a diligent and impartial examination’ of the matters before them and to ‘take into account all relevant information prior to arriving at an administrative decision’.
Where administrators will not disclose the input variables, decisional criteria or risk indicators used by automated systems in reaching a decision, it is very difficult to assess whether only impartial and relevant information was incorporated into the decision-making process. Individuals cannot meaningfully exercise their right to be heard or defend themselves against an allegation of fraud when they do not know which features of their profile caused them to be selected for investigation. Equally, the requirement to provide reasons cannot be satisfied where a government refuses to disclose the decisional logic or criteria used to select an individual for investigation, as occurred in the SyRI case.
‘Gaming’ the algorithm
The risk that important information held by governments will be misused if it is made public is an age-old concern that long pre-dates automation of government decision-making. Both governments and the private sector keep all sorts of information secret including techniques for crime detection, calculation of credit scores, and credit card fraud detection methods, on the basis of preventing public abuse or manipulation of the information.
Section 31 of the UK’s Freedom of Information Act exempts information from disclosure where disclosure is likely to prejudice ‘crime prevention and enforcement’ or ‘immigration control’. Section 31 is often relied upon by government departments when individuals or civil society organisations seek information about algorithmic tools in use. For example, the Information Tribunal recently upheld the Home Office’s refusal, in reliance on s 31, to disclose 5 of the 8 criteria used by its automated triage tool for identifying likely sham marriages.
The UK government is not alone in its reluctance to provide information about its automated systems. The Dutch government refused to disclose how the SyRI system operated on the basis that disclosure could result in manipulation by the general public which would hinder its effectiveness.
Lighthouse Reports worked for two years, using Dutch Freedom of Information laws, to obtain an executable version of the model used by the city of Rotterdam to identify ‘high risk’ individuals. While the level of disclosure in this case was impressive, the city of Rotterdam provided the information only once it was no longer using that particular model in its decision-making. It has not disclosed the current model it is developing.
The risk that an automated system might be hacked, personal data compromised or cyber security breached appears a compelling reason for non-disclosure. However, it is not just the technical aspects of these tools which suffer from a transparency deficit: the decisional criteria or risk indicators used by these tools in selecting individuals are also generally kept secret. Without disclosure of at least some of these features it is very difficult for individuals to defend themselves against allegations of fraud.
Why the risk of gaming argument cannot justify a blanket refusal of disclosure
Busuioc, Curtin and Almada note ‘the effectiveness of secrecy as an antidote for gaming is far from uncontested in the technical literature’. This means that the very real threat to procedural fairness from secrecy is being traded against an uncertain benefit in terms of preventing gaming risks.
Furthermore, much of the computer science literature suggests that systems can be designed to use criteria which mitigate gaming risks. Where an automated system uses fixed or immutable characteristics of an individual, such as their age, sex, race, nationality, height, family history or criminal record, those inputs cannot be gamed. Generally, any criterion which is ‘not based on user behaviour offers no mechanism for gaming from individuals who have no direct control over those attributes’.
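The distinction between immutable and behavioural inputs can be illustrated with a toy risk score. The features and weights below are entirely hypothetical and do not reflect any deployed system; the sketch simply shows that a score built only on fixed attributes cannot be lowered by changing one’s conduct, while a score built on self-reported behaviour can.

```python
# Toy illustration (hypothetical features and weights): a risk score
# computed from immutable attributes is unaffected by behavioural change,
# whereas a score computed from behaviour can be driven down.

IMMUTABLE_WEIGHTS = {"age_under_25": 0.25, "prior_conviction": 0.75}
BEHAVIOURAL_WEIGHTS = {"emails_to_city_over_10": 0.5, "plays_sport": -0.25}

def risk_score(profile: dict, weights: dict) -> float:
    """Weighted sum of the binary features present in the profile."""
    return sum(w for feat, w in weights.items() if profile.get(feat))

applicant = {"age_under_25": True, "prior_conviction": True,
             "emails_to_city_over_10": True, "plays_sport": False}

# The immutable component is fixed for this applicant, however they behave...
gamed = dict(applicant, emails_to_city_over_10=False, plays_sport=True)
print(risk_score(applicant, IMMUTABLE_WEIGHTS))   # unchanged by 'gaming'
print(risk_score(gamed, IMMUTABLE_WEIGHTS))       # same immutable score
# ...but the behavioural component falls once conduct is adjusted.
print(risk_score(applicant, BEHAVIOURAL_WEIGHTS))
print(risk_score(gamed, BEHAVIOURAL_WEIGHTS))
```

Only the behavioural score responds to the applicant’s changed answers, which is the intuition behind the claim that criteria not based on user behaviour offer no mechanism for gaming.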
The disclosure of at least some risk indicators used by these systems allows for an assessment of their relevance and of the risk of discrimination. The Court found in the SyRI case that ‘due to the absence of verifiable insights into the risk indicators and the risk model as well as the function of the risk model’ it could not establish whether the threat of discrimination was ‘sufficiently neutralised’. Nationality was used as one of the criteria in the UK government’s automated visa streaming tool for entry into the UK. Nationality is not a ‘gameable feature’, as it cannot be easily changed. The lack of transparency around the visa streaming tool meant nationality was used for years as a criterion without public knowledge. Similarly, in the Rotterdam case, Dutch language skills were used as a risk indicator by the system. The city of Rotterdam has now ‘concluded that the Dutch language variables were an unacceptable proxy for migrant background’. In the Rotterdam case, the length of someone’s last relationship, how many times they had emailed the city and whether they played sports were all data points fed into the system. It is an obvious point that individuals cannot contest the relevance of these inputs without knowledge of their use.
If automated tools use self-reported features which are independently verified by administrators, gaming is also much less of a risk. This is already how governments operate when confirming an individual’s eligibility for government programmes in cases where the eligibility criteria are publicly available. Series and Clements give the example of Local Authorities in the UK using Resource Allocation Systems (RASs) for deciding how to allocate community care budgets. The authors argued that Local Authorities should disclose how these community care budgets are calculated, as Local Authorities are under a duty to independently verify the information provided by claimants, which, if conducted properly, will mitigate any risk of manipulation.
Even where the risk indicators or decisional criteria are disclosed, without the weightings accorded to each it is difficult for individuals to manipulate their answers to receive a favourable outcome. This is why banks typically disclose the factors used in calculating credit scores and loan approvals without disclosing the exact weightings. In the Rotterdam system, which used over 315 data points, it is extremely difficult to know how each factor influences the overall outcome without a fully executable version of the model.
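The point about weightings can be sketched concretely. In the hypothetical example below, two models share the same disclosed indicators but assign them different secret point weights; an applicant who knows only the indicator list cannot tell which change to their profile would lower their score most. The indicators echo the Rotterdam data points mentioned above, but the weights are invented for illustration.

```python
# Hypothetical: two risk models use the same disclosed indicators with
# different, undisclosed point weights. Knowing the indicators alone does
# not reveal which manipulation of a profile is most effective.

INDICATORS = ["short_last_relationship", "many_emails_to_city", "no_sport"]

def score(profile: dict, weights: list) -> int:
    """Sum the points for each indicator that is true in the profile."""
    return sum(w for feat, w in zip(INDICATORS, weights) if profile[feat])

profile = {"short_last_relationship": True,
           "many_emails_to_city": True,
           "no_sport": True}

weights_a = [6, 1, 1]   # model A: relationship history dominates
weights_b = [1, 6, 1]   # model B: contact frequency dominates

for feat in INDICATORS:
    changed = dict(profile, **{feat: False})
    print(feat, score(changed, weights_a), score(changed, weights_b))
# Under model A, changing the relationship indicator lowers the score most;
# under model B, reducing emails does: the indicator list alone does not
# tell the applicant which manipulation pays off.
```

This is the sense in which disclosing indicators without weightings preserves most of the system’s resistance to gaming, particularly as the number of data points grows.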
Transparency of the risk indicators used by fraud detection systems is essential if individuals are to meaningfully exercise their right to be heard, their right to object to unfair decisions and their right to reasons for decisions made about them. While gaming of these systems is a legitimate risk, a blanket refusal to disclose any of the risk indicators or decisional criteria used shifts the dial too far in favour of the states deploying these systems and threatens procedural fairness protections for citizens.
Alexandra Sinclair is a doctoral student at the LSE and a Research Fellow at the Public Law Project. Alexandra has an LLB with first class honours from Victoria University of Wellington and an LLM from Columbia Law School in New York City where she studied as a Fulbright Scholar. She has been published in the Modern Law Review and is a frequent contributor to the UK Constitutional Law Association Blog. She has also written for the UK immigration law blog Free Movement, the Law Society Gazette and Prospect Magazine. Her doctoral research focuses on how administrative law doctrines can be applied to automated decision-making by government.