Criminal justice systems exist to protect public safety whilst respecting human rights and the presumption of innocence. They’re inherently human systems, built on judgement, discretion, and accountability within legal frameworks. The introduction of artificial intelligence into policing and sentencing creates a fundamental tension: we’re replacing human judgement with algorithmic decisions at precisely the points where human rights are most vulnerable and the stakes are highest. Through my work with Inside Out Justice on prison reform and criminal justice, I’ve observed firsthand how technology intersects with justice systems, and the implications are profound and concerning.
The appeal of AI in criminal justice is superficially compelling. Algorithms don’t have unconscious bias, don’t get tired, don’t have bad days. They process information faster than humans and can incorporate more variables. Police forces facing budget pressures and rising demand see AI as potentially increasing efficiency. Courts dealing with sentencing backlogs see algorithmic assistance as potentially ensuring consistency. But this view fundamentally misunderstands both AI and justice. Algorithms do encode bias—they encode the biases present in training data and in the decisions about what matters. And justice requires not just efficient decisions but legitimate ones, grounded in accountability and transparency that algorithms often can’t provide.
Live Facial Recognition: The Met and South Wales Police
The most visibly controversial use of AI in UK policing has been live facial recognition (LFR) technology. The Metropolitan Police Service and South Wales Police have deployed cameras capable of real-time facial recognition, comparing faces in crowds against watchlists of wanted individuals or persons of interest. The technology operates in real-world conditions, with no notification to the public that recognition is occurring, and matches can trigger police intervention. The initial justifications were straightforward: find wanted criminals faster, increase public safety.
The practical results have been deeply troubling. Studies of South Wales Police’s LFR operations found that approximately 92% of the matches the system generated were false: roughly nine of every ten times the system flagged someone as a match, the person was not actually the wanted individual. This isn’t a minor technical failure. Innocent people were being stopped, detained, and interrogated based on false algorithmic matches. The discriminatory impact is also evident: the system disproportionately misidentified people of colour, a problem well-documented in facial recognition research but not adequately addressed in deployment.
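To see why alert-level error rates can be this high even when the underlying technology sounds accurate, it helps to work through the base-rate arithmetic. The sketch below uses entirely hypothetical numbers; the crowd size, watchlist prevalence, and per-face error rates are assumptions for illustration, not South Wales Police’s actual figures. It simply shows how scanning large crowds for rare watchlist members turns most alerts into false ones.

```python
# Illustrative only: hypothetical numbers, not South Wales Police's actual figures.
crowd_size = 50_000            # faces scanned at an event (assumed)
watchlist_prevalence = 0.0002  # 1 in 5,000 people scanned is genuinely on the watchlist (assumed)
true_positive_rate = 0.90      # the system flags 90% of genuine watchlist members (assumed)
false_positive_rate = 0.001    # the system wrongly flags 0.1% of everyone else (assumed)

wanted = crowd_size * watchlist_prevalence
not_wanted = crowd_size - wanted

true_alerts = wanted * true_positive_rate
false_alerts = not_wanted * false_positive_rate

share_of_alerts_that_are_wrong = false_alerts / (true_alerts + false_alerts)
print(f"True alerts: {true_alerts:.0f}, false alerts: {false_alerts:.0f}")
print(f"Share of alerts pointing at the wrong person: {share_of_alerts_that_are_wrong:.0%}")
# With these assumptions, roughly 85% of alerts point at innocent people,
# even though the per-face error rates sound small.
```

The point is not the precise percentage but the structure of the problem: when the people being searched for are rare, even a small per-face false match rate swamps the genuine alerts.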
What’s particularly concerning is that these systems were deployed with minimal public consultation, limited transparency about how they operated, and inadequate safeguards. The watchlists themselves are opaque—individuals don’t know they’re on them, can’t contest their inclusion, and can’t correct errors. A person mistakenly placed on a watchlist could face repeated unwarranted police stops based on algorithmic misidentification. The systems operate in a regulatory grey area where current law is ambiguous about whether their use is even legal.
The Bridges v South Wales Police Case
The case of Bridges v South Wales Police, decided by the Court of Appeal in 2020 after the High Court had initially dismissed the claim, represents the crucial legal challenge to LFR in the UK. Mr Bridges, a civil liberties campaigner, challenged South Wales Police’s deployment of facial recognition technology on multiple grounds: that it breached his right to respect for private life under Article 8, that it breached data protection law, and that the force had failed to discharge its public sector equality duty. The case forced the question of whether police forces even had an adequate legal framework for deploying these systems without specific legislative approval.
The Court of Appeal ruled that South Wales Police had indeed acted unlawfully. It found that the legal framework gave officers too much discretion over who could be placed on watchlists and where the technology could be deployed, that the force’s data protection impact assessment was deficient, and that it had not done enough to satisfy itself that the software did not carry racial or gender bias, in breach of the public sector equality duty. The judgement was significant not just for halting the specific deployment but for establishing that law enforcement can’t simply adopt new surveillance technologies because they’re potentially useful. There must be genuine legal authority, proper process, and demonstrated necessity.
Yet the judgement didn’t close the door to facial recognition in policing entirely. The court suggested that LFR might be lawful if implemented with proper safeguards, appropriate use cases, and genuine legal authority. This leaves whether and how LFR should be used as a live policy question, unresolved by the courts. Various police forces and the Home Office have suggested different positions, but the fundamental tension remains: facial recognition offers law enforcement capabilities that can benefit public safety but poses such substantial risks to privacy, fairness, and due process that deployment requires exceptional justification.
Facial Recognition Technology: The Science and the Bias
To understand facial recognition in policing, it’s essential to understand the technology itself and its limitations. Facial recognition works by extracting facial features from images—the distances between eyes, nose shape, jaw line, etc.—and comparing them against known faces in databases. Modern systems use deep learning to extract these features, allowing systems to perform recognition in various lighting conditions, angles, and image qualities. In controlled laboratory conditions with high-quality images of cooperative subjects, modern facial recognition can achieve accuracy rates above 99%.
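For readers unfamiliar with the mechanics, the matching step amounts to comparing numeric ‘embeddings’ of faces and flagging anything above a similarity threshold. The sketch below is a minimal illustration of that step only; the function names, the 128-dimensional random vectors, and the 0.6 threshold are assumptions for demonstration, and deployed systems derive their embeddings from deep networks trained on very large face datasets.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two face embeddings (1.0 means identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_against_watchlist(probe: np.ndarray,
                            watchlist: dict[str, np.ndarray],
                            threshold: float = 0.6) -> list[tuple[str, float]]:
    """Return watchlist identities whose stored embeddings exceed the threshold.

    `probe` is the embedding of a face seen by the camera; `watchlist` maps
    identity labels to enrolled embeddings. Here they are just vectors; in a
    real system both come from a trained deep network.
    """
    hits = [(identity, cosine_similarity(probe, enrolled))
            for identity, enrolled in watchlist.items()]
    return sorted([h for h in hits if h[1] >= threshold], key=lambda h: h[1], reverse=True)

# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
watchlist = {"person_A": rng.normal(size=128), "person_B": rng.normal(size=128)}
probe = watchlist["person_A"] + rng.normal(scale=0.3, size=128)  # noisy re-capture of person_A
print(match_against_watchlist(probe, watchlist))
```

Much of the policy debate hides in that threshold: lower it and the system catches more genuine matches but generates more false alerts, and the appropriate value shifts with image quality and, as discussed below, with demographics.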
Yet police deployments don’t occur in controlled laboratory conditions. They occur in crowds, with varying lighting, with people not looking at cameras, with varying image quality. Performance degradation in real-world conditions is substantial. Additionally, the accuracy of facial recognition varies significantly across demographic groups. Systems trained primarily on white faces demonstrate substantially higher error rates on people of colour, particularly on women with darker skin tones. This isn’t incidental; it reflects training data and algorithm design choices that have embedded demographic bias into the systems.
The research by Buolamwini and Gebru on bias in commercial facial analysis systems demonstrated error rates of over 30% for darker-skinned women in some commercial gender-classification systems, compared with error rates under 1% for lighter-skinned men. That study examined facial analysis rather than identification, but subsequent evaluations, including NIST’s testing of face recognition algorithms, have found similar demographic differentials in match accuracy. The upshot is that facial recognition systems deployed in policing without adequate bias testing are more likely to falsely identify people of colour and women as matches. The consequences of these false matches aren’t abstract: they translate into disproportionate police stops, interrogations, and potential wrongful detention.
Predictive Policing: The Illusion of Objectivity
Beyond facial recognition, predictive policing algorithms have been deployed by various police forces with the goal of identifying crime hotspots or high-risk individuals. These systems use historical crime data to train machine learning models that predict where future crimes are likely to occur or which individuals are likely to offend. The pitch is compelling: objective, data-driven allocation of police resources. Yet predictive policing represents perhaps the clearest example of how algorithms can automate and amplify existing biases whilst creating an illusion of objectivity.
Historical crime data reflects not actual crime distribution but policing distribution. Police are concentrated in certain neighbourhoods, certain communities are more frequently stopped, arrest records are skewed toward over-policed areas. When algorithms are trained on this data, they learn the patterns of policing, not the patterns of actual crime. A predictive algorithm trained on arrest data will predict that certain neighbourhoods should receive more police presence—and when police presence increases, more arrests occur in those neighbourhoods, generating data that confirms the algorithm’s prediction. This creates a feedback loop where algorithmic predictions generate the data that justifies the predictions.
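The feedback loop is easy to demonstrate with a toy simulation. In the sketch below, two neighbourhoods have identical underlying offending; the only difference is which one the ‘predictive’ step initially labels a hotspot, and offences only enter the data where officers are present to record them. Every number is invented for illustration.

```python
import random

random.seed(1)

# Two areas with IDENTICAL underlying offending (an assumption of this toy model).
true_offences_per_week = {"area_A": 20, "area_B": 20}

# Historical bias: area_A starts the year flagged as the hotspot.
hotspot = "area_A"
recorded = {"area_A": 0, "area_B": 0}

for week in range(52):
    for area, offences in true_offences_per_week.items():
        # Offences enter the dataset only if police are present to record them.
        detection_prob = 0.75 if area == hotspot else 0.25
        recorded[area] += sum(random.random() < detection_prob for _ in range(offences))

    # 'Predictive' step: next week's hotspot is whichever area has more recorded crime.
    hotspot = max(recorded, key=recorded.get)

print("Recorded offences after a year:", recorded)
# Typical output: area_A shows roughly three times area_B's recorded crime,
# even though true offending was identical. The prediction generated the data
# that appears to vindicate it.
```

The disparity in the recorded data is entirely an artefact of where the system sent officers to look.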
The consequences are serious. Neighbourhoods already over-policed due to historical bias receive even more police presence based on algorithmic predictions. Residents in those areas experience more frequent stops, more interactions with police, more arrests. Young people in those areas grow up with criminal justice contact, limiting opportunities, affecting mental health, creating actual criminal pathways that the algorithm predicted. The algorithm becomes self-fulfilling. What appears to be objective, data-driven policing is actually the automation and amplification of existing discriminatory patterns.
Risk Assessment in Sentencing and Parole
Algorithmic risk assessment tools are used in sentencing and parole decisions in various jurisdictions. These systems take information about the offender, their history, and their circumstances, and generate a risk score predicting the likelihood of reoffence if released. Judges and parole boards use these risk scores in making sentencing and release decisions. The theory is that structured algorithmic assessment of risk is more accurate and less subject to bias than unaided human judgement. The reality is more complicated and concerning.
One of the most studied examples is COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), widely used in American criminal justice systems. Research by Dressel and Farid found that COMPAS predicted reoffending with only around 65% accuracy, no better than untrained volunteers, and ProPublica’s earlier analysis of the same tool found racially skewed error patterns: the algorithm more frequently overestimated risk for Black defendants who did not go on to reoffend and underestimated risk for white defendants who did. This bias affected sentencing recommendations and parole decisions, with demonstrable impacts on sentencing disparities between racial groups.
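The disparity at issue is measurable with a straightforward audit: compare error rates across groups, in particular the false positive rate, meaning the share of people who did not reoffend but were labelled high risk. The sketch below runs that check on synthetic data into which a bias has been deliberately built; it illustrates the shape of the analysis, not COMPAS’s actual data, scores, or methodology.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Entirely synthetic data: a group label, whether the person actually reoffended,
# and a risk score that we deliberately skew upward for group_2.
group = rng.choice(["group_1", "group_2"], size=n)
reoffended = rng.random(n) < 0.35
risk_score = rng.normal(size=n) + reoffended * 1.0 + (group == "group_2") * 0.5
predicted_high_risk = risk_score > 0.8

for g in ("group_1", "group_2"):
    # False positive rate: people who did NOT reoffend but were labelled high risk.
    mask = (group == g) & ~reoffended
    fpr = predicted_high_risk[mask].mean()
    print(f"{g}: false positive rate = {fpr:.1%}")
# The audit surfaces the disparity we built in: substantially more non-reoffenders
# in group_2 are wrongly labelled high risk. That is the shape of the pattern
# reported for COMPAS, not its actual numbers.
```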
The problem is similar to predictive policing: the algorithms are trained on data reflecting the outcomes of historically biased systems. Sentences given in the past reflect the biases of past judges, policing patterns that affected arrest rates, and historical discrimination. The algorithm learns these patterns and replicates them. Additionally, the variables used in risk assessment often include factors that proxy for socioeconomic status or race—neighbourhood characteristics, employment history, family structure—creating opportunities for indirect discrimination.
The Black Box Problem in Justice
A fundamental problem with algorithmic decision-making in criminal justice is the opacity of the decision-making process. When a judge sentences someone, the sentencing reasoning is articulated in open court, is based on sentencing guidelines and precedent, and is potentially appealable based on articulated reasons. When an algorithm generates a risk score that influences sentencing, the basis of that score is often opaque—proprietary trade secrets, complex mathematical relationships, training data not disclosed publicly. A defendant might be influenced by an algorithmic risk score they can’t understand, can’t access, and can’t effectively challenge.
This creates a justice problem beyond the technical question of whether algorithms are accurate. It’s a due process problem. Defendants have a right to know the evidence against them and to challenge that evidence. When algorithmic risk scores are treated as evidence but are essentially unexplainable to the defendant, this right is compromised. An algorithm might rate someone as high-risk based on patterns in data that the defendant has no way of understanding or contesting.
The FCA’s explainability requirement for AI in finance is one approach to this problem—requiring that firms explain algorithmic decisions. Justice systems require something similar: algorithms used in sentencing or parole decisions should be transparent, explainable, and challengeable. The proprietary protection of algorithmic decision-making in criminal justice is essentially requiring citizens to accept limitations on their due process rights to protect commercial interests. This is fundamentally unacceptable.
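It is worth being concrete about what ‘transparent, explainable, and challengeable’ could look like. The sketch below is a deliberately simple, hypothetical points-based score; the factors and weights are invented for illustration and are not drawn from any real tool. Its only virtue is that every point is attributable to a named input that a defendant could check and contest.

```python
# Hypothetical factors and weights, invented purely to illustrate explainability.
FACTORS = {
    "prior_convictions": 2.0,    # points added per prior conviction
    "age_under_25": 3.0,         # flat addition if the person is under 25
    "currently_employed": -2.0,  # flat reduction if the person is employed
}

def risk_score(prior_convictions: int, age: int, employed: bool) -> tuple[float, dict]:
    """Return the total score and the contribution of each named factor."""
    contributions = {
        "prior_convictions": FACTORS["prior_convictions"] * prior_convictions,
        "age_under_25": FACTORS["age_under_25"] if age < 25 else 0.0,
        "currently_employed": FACTORS["currently_employed"] if employed else 0.0,
    }
    return sum(contributions.values()), contributions

score, breakdown = risk_score(prior_convictions=3, age=22, employed=True)
print(f"Total score: {score}")
for factor, points in breakdown.items():
    print(f"  {factor}: {points:+.1f}")
# Every point is traceable to a named factor, so an incorrect input (say, a
# wrongly recorded prior conviction) can be identified and challenged.
```

Transparency of this kind does not by itself remove bias, since inputs such as employment can proxy for class or race, as noted above, but it makes the basis of the score visible and contestable in a way a proprietary black box never is.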
Sentencing Consistency and Algorithmic Authority
One of the arguments for algorithmic risk assessment is that it improves consistency—removing the human variation in sentencing that leads to different sentences for similar offences. There’s a genuine concern underlying this argument: judges do have substantial discretion in sentencing, and two judges might sentence the same person differently based on personal philosophy or unconscious bias. Algorithmic consistency is superficially appealing as a solution.
Yet sentencing, properly understood, should include discretion. Judges are supposed to consider individual circumstances, exercise mercy where appropriate, and tailor sentences to fit particular people’s situations. The same crime can warrant different sentences depending on the offender’s circumstances, capacity for rehabilitation, impact on victims, and numerous other factors. Excessive consistency—reducing all cases to algorithmic formulas—risks losing the individuated justice that fairness often requires. Additionally, if the algorithm is biased, increasing consistency simply automates bias at scale. Biased consistency is worse than inconsistent judgement where the bias is sometimes mitigated by individual consideration.
There’s also a psychological risk: when judges or parole boards receive algorithmic risk scores, there’s a tendency to defer to the algorithm’s judgement, treating it as more objective than human judgement when it may not be. Studies show that decision-makers using algorithmic aids tend to use the aids as anchors, not adjusting substantially away from algorithmic recommendations even when they have good reasons to. This creates a situation where the algorithm effectively makes decisions, with humans providing post-hoc legitimation rather than genuine judgement.
The ECHR Implications and Human Rights Framework
The European Convention on Human Rights, which UK law incorporates, provides fundamental protections relevant to algorithmic decision-making in criminal justice. Article 5 guarantees the right to liberty and security, with restrictions only permissible on legal grounds and through proper procedures. Article 6 guarantees the right to a fair trial, including knowledge of the evidence against you. Article 8 guarantees respect for private life. Algorithmic decision-making in policing and sentencing potentially engages all of these rights.
The Bridges case established important principles: that surveillance technology affecting liberty and privacy requires proper legal authority, must be used proportionately, and must have adequate safeguards. Applied more broadly, this suggests that algorithmic decision-making in criminal justice—whether facial recognition, predictive policing, or risk assessment—must meet high standards of transparency, accuracy, fairness, and accountability. These aren’t merely technical requirements; they’re rights-protection requirements flowing from the human rights framework.
Additionally, the ECHR’s jurisprudence increasingly recognises that algorithmic decision-making affecting fundamental rights requires substantive, not just procedural, fairness. This means not just that proper process was followed, but that the outcomes reflect fairness and don’t perpetuate discrimination. For criminal justice AI, this creates an obligation to ensure that algorithms don’t systematically disadvantage particular groups, and that if bias is discovered, systems are not deployed or are substantially modified.
The Case for Extreme Caution
My position on AI in criminal justice is that extreme caution is warranted. This isn’t opposition to all AI use, but rather recognition that criminal justice is an exceptional domain where the stakes—loss of liberty, impact on fundamental rights—are extraordinarily high. This is not comparable to algorithmic recommendation systems or credit decisions, where mistakes are regrettable but manageable. A false positive in facial recognition in policing can lead to wrongful detention. A biased risk assessment can lengthen a sentence. A discriminatory predictive policing algorithm can distort entire neighbourhoods toward criminalisation.
This exceptional stakes argument suggests that AI should only be deployed in criminal justice when there’s clear evidence that it improves outcomes compared to human decision-making, when bias has been rigorously tested and demonstrated to be absent, when the system is transparent enough for defendants to challenge, and when there are robust safeguards and accountability mechanisms. In practice, very few current applications meet these criteria. Facial recognition deployed without adequate bias testing doesn’t meet them. Predictive policing trained on biased historical data doesn’t meet them. Risk assessment algorithms with proprietary decision-making don’t meet them.
The burden of proof should be on those proposing AI use in criminal justice to demonstrate that it will improve justice outcomes. Where evidence suggests bias, opacity, or inability to provide defendants fair process, the default should be not to deploy. The efficiency gains or cost reduction that AI offers are not sufficient justification when fundamental rights are at risk.
The Chilling Effect on Liberty and Privacy
Beyond the direct impacts on individuals subject to algorithmic decision-making, there’s a systemic chilling effect. When facial recognition operates in public spaces, when predictive policing concentrates police presence in particular neighbourhoods, when risk assessment affects sentencing, people’s behaviour and freedom are constrained. People become less willing to move through public spaces if they fear recognition and police contact. People become less willing to congregate in certain neighbourhoods if they’re subject to intensive policing. People become less willing to engage in activities that might generate data suggesting risk, even lawful activities.
This chilling effect is a rights impact separate from the accuracy of the algorithms. Even if algorithmic decision-making were perfectly accurate, its existence in criminal justice would constrain freedom of movement, freedom of assembly, and freedom to engage in activities that generate particular data patterns. This is not merely an inconvenience; it’s a fundamental limitation on liberty. When the majority of the population doesn’t directly experience this constraint (because they don’t live in intensively policed areas, don’t generate algorithmic flags), but particular groups do, the constraint becomes an engine of inequality.
I’ve observed this in prisons through my work with Inside Out Justice: when surveillance intensifies, whether through cameras, monitoring systems, or data collection, people’s behaviour becomes constrained in ways that reflect anxiety and loss of autonomy rather than genuine improved safety. The same principle applies to algorithmic policing in society. The experience of being under algorithmic surveillance is qualitatively different from being in communities with less surveillance, and this difference maps onto existing lines of inequality.
Necessary Safeguards and Governance Framework
If AI is to be used in criminal justice—and I believe very limited use might be appropriate in some contexts—it must be subject to rigorous safeguards. These would include: mandatory bias testing before deployment, with particular attention to demographic disparities and regular revalidation; transparency requirements sufficient that defendants can understand how algorithms influenced decisions; a right of appeal against decisions influenced by algorithms, with the burden on the system to justify the algorithm’s output; regular auditing by independent researchers, without access restrictions based on proprietary concerns; and clear legal frameworks establishing when and how algorithms can be used, with public consultation before any deployment.
Additionally, there should be restrictions on certain applications. Facial recognition in public spaces without consent and without evidence of necessity should be prohibited. Predictive policing should be particularly tightly restricted given its automatic feedback loop problem. Risk assessment algorithms should be used only as aids to human judgement, with explicit requirements that judges and parole boards articulate their reasoning and explain when they’re deviating from algorithmic recommendations.
Perhaps most importantly, any use of AI in criminal justice should be subject to an independent body with the authority to order suspension or cessation if evidence emerges of bias, inaccuracy, or rights violations. Currently, individual police forces or courts can deploy or discontinue algorithms independently. A national oversight mechanism would ensure consistency and enable proper monitoring across the system.
The Broader Question of Legitimacy
Beyond the technical questions of bias and accuracy, there’s a legitimacy question underlying algorithmic criminal justice: what is the relationship between legitimacy and transparency? A justice system’s legitimacy partly depends on public confidence that it’s fair, that decisions are made according to law, and that people can understand why decisions were made. When decisions are influenced by algorithms that operate as black boxes, even if those algorithms happen to be accurate and unbiased, legitimacy is undermined because people can’t see how the system works.
In my work on prison reform, I’ve observed that legitimacy matters profoundly. Prisoners and their families are more likely to accept adverse decisions when they understand the reasoning, when they have opportunity to be heard, when the process is transparent. Decisions that appear arbitrary or opaque breed distrust and resentment. The same principle applies to algorithmic criminal justice: even if algorithms were perfectly fair and accurate, their use would still undermine legitimacy unless they’re transparent enough that people can understand and contest decisions.
This suggests that even in cases where algorithms might technically improve outcomes, they shouldn’t be used if deployment undermines the perceived fairness and legitimacy of the criminal justice system. Justice isn’t just about outcomes; it’s about processes being fair, transparent, and legitimate. Algorithmic decision-making threatens this not through any technical failure but through opacity that separates decision-making from the transparency justice requires.
International Comparisons and the EU Approach
Different jurisdictions are taking different approaches to algorithmic criminal justice. The EU is moving toward stricter regulation, with the AI Act restricting certain uses of AI in law enforcement and criminal justice, requiring human oversight of critical decisions, and establishing transparency requirements. Some individual European countries have restricted or prohibited facial recognition in public spaces. The United States continues to use risk assessment in sentencing despite evidence of bias, whilst also having numerous jurisdictions adopting restrictions on facial recognition.
The UK’s approach has been relatively cautious but inconsistent. The Bridges ruling established important protections against surveillance technology deployment without proper legal authority. Yet the government has suggested interest in clarifying the legal framework to enable some facial recognition under proper conditions. There’s risk that ‘proper conditions’ will be defined too loosely, enabling deployments that replicate the Bridges problems in slightly different form.
International evidence suggests that cautious jurisdictions with restrictions on algorithmic surveillance and strong transparency requirements are successfully maintaining public safety without deploying the most aggressive AI systems. This suggests that the choice between justice and security is false—that it’s possible to have both, but requires rejecting the most extreme applications of AI in policing.
Technological Evolution and Governance Lag
A persistent problem with AI in criminal justice is that technology evolves faster than governance frameworks. Facial recognition, predictive policing, and risk assessment algorithms all emerged and began deployment before comprehensive regulatory frameworks were established. The Bridges case helped clarify some legal requirements, but gaps remain. Meanwhile, new applications of AI in criminal justice continue to emerge: behaviour detection algorithms, emotion recognition for deception, gang affiliation prediction. These systems start being piloted before their implications have been fully considered.
This governance lag is dangerous. It enables deployment of systems that haven’t been properly scrutinised for bias, effectiveness, or rights implications. By the time problems are identified and publicised, systems have been operating for years, generating data, creating dependencies. Correcting course becomes politically and bureaucratically difficult. What’s needed is forward-thinking governance that anticipates emerging AI applications and establishes frameworks before deployment, rather than responding reactively after problems arise.
This is particularly important given the trajectory of AI development. As AI systems become more capable, the temptation to deploy them in criminal justice will increase. Voice-based emotion recognition might be deployed to assess credibility. Gait recognition might supplement facial recognition. Behaviour analysis algorithms might predict future crime. Each new system presents the same potential problems: bias, opacity, rights impacts. Getting the governance framework right now, before these technologies are widely deployed, is essential.
Conclusion: Justice Requires Restraint
The ethics of AI in criminal justice ultimately comes down to a fundamental principle: criminal justice is where the state’s power over individuals is greatest, where fundamental rights are most at stake, and where the burden of proof should be on demonstrating that algorithmic systems improve outcomes and protect rights, not simply that they’re efficient or cost-saving. Current evidence doesn’t demonstrate this for most AI applications in policing and sentencing.
Facial recognition deployed without adequate bias testing is more likely to produce injustice than justice. Predictive policing that creates feedback loops amplifying historical bias perpetuates discrimination. Risk assessment algorithms that are opaque and biased limit defendants’ ability to receive fair trials. These aren’t acceptable trade-offs for efficiency gains. A justice system built on algorithmic decision-making that can’t be transparently understood, that encodes historical bias, that limits human judgement and accountability, is a system that has abandoned the principles that make justice legitimate.
The path forward requires acknowledging that AI has limitations as well as capabilities, that criminal justice is an exceptional domain requiring exceptional restraint, and that legitimacy and fairness are as important as efficiency. This means restricting facial recognition until bias can be adequately controlled, examining and likely restricting predictive policing, and redesigning risk assessment to ensure transparency and contestability. It means treating criminal justice as an arena where the precautionary principle applies: when fundamental rights are at stake, the burden is on demonstrating safety before deploying new systems. For too long, that principle has been inverted. Correcting this requires political will and sustained commitment to putting rights protection before technological adoption.
Scott Dylan is a Dublin-based British entrepreneur, investor, and mental health advocate. He is the Founder of NexaTech Ventures, a venture capital firm with a £100 million fund supporting AI and technology startups across Europe and beyond. With over two decades of experience in business growth, turnaround, and digital innovation, Scott has helped transform and invest in companies spanning technology, retail, logistics, and creative industries.
Beyond business, Scott is a passionate campaigner for mental health awareness and prison reform, drawing from personal experience to advocate for compassion, fairness, and systemic change. His writing explores entrepreneurship, AI, leadership, and the human stories behind success and recovery.