What is AI bias and why does it keep happening?
AI bias is not a bug to be patched. It is built into how these systems are trained. Here is how it happens and how to use AI sensibly.
AI bias is what happens when a system’s outputs systematically favour or disadvantage some groups in ways the people using it would not want. It is rarely a villain at the keyboard. It is almost always the quiet accumulation of choices about data, objectives and feedback.
The short version
Bias is structural, not personal. It enters AI systems at three stages: training data, the model’s objective, and human feedback. Each on its own is reasonable. Together they produce outcomes nobody explicitly asked for.
- Bias starts with the data. If the internet shows mostly male chief executives, the model learns that as a pattern.
- It is then shaped by the objective. Models trained to predict “successful” past hires copy the demographics of who got hired before.
- Human feedback adds its own layer. Whose preferences shape the model becomes part of the product.
- Real examples include Amazon’s scrapped recruiting tool, biased US healthcare algorithms, and facial recognition that fails on darker skin.
- UK regulators now treat AI bias as a live consumer issue. “The computer said so” is no longer a defence.
What AI bias actually means
Ask an image generator for a photo of a doctor. Until recently, you would almost always have been handed a white man in a lab coat. Ask for a nurse and the same system would politely return a woman. Nothing in the prompt suggested a gender or a skin colour.
The model simply learned from the pictures it trained on. It absorbed what a doctor or a nurse tends to look like in the world it was shown. It confidently served that pattern back as if it were a fact. That gap, between what the model has seen and what is actually true or fair, is the easiest way to understand AI bias.
The word itself trips people up. In everyday English, bias sounds like a personal prejudice, a choice someone makes. In machine learning it is more boring and more structural.
A system is biased when its outputs systematically favour or harm some groups in ways the people using it would not want.
AI bias in the training data: the starting weather
Bias enters AI at three main stages. The first is the training data. Large language models and image systems are built on vast libraries of text and pictures. The libraries are scraped from the internet, digitised books, news archives, forums and stock photo collections.
Whatever skews already sit in that pile become the system’s view of the world. If English news coverage disproportionately names men as chief executives and women as caregivers, the model will quietly internalise that as a pattern. Medical research is another case. If studies are still dominated by white male patients, a diagnostic tool trained on that literature will be sharper for some patients than for others.
The data is the starting weather, and almost every other problem grows out of it. For a related angle on what models get wrong, see our post on what AI hallucination is and why it happens.
AI bias in the objective: optimising the wrong thing
The second stage is the objective the model is trained to optimise. Modern systems are not given a goal like “be fair.” They are given a measurable score. The score might be predicting the next word, recommending content people click on, or approving loans that get repaid.
These scores are proxies for what we actually care about. Proxies leak. A hiring model rewarded for matching “successful” past candidates will reproduce the demographics of who previously got hired. In most industries, that is not a neutral starting point.
A news feed rewarded for engagement will learn that outrage drives more clicks than nuance, and will tilt accordingly. The system is not malicious. It is doing exactly what it was asked. It is the asking that was incomplete.
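The hiring case can be made concrete with a deliberately crude sketch. Everything in it is hypothetical: the records, the group labels and the scoring rule are invented for illustration, and a real model would pick up group membership through subtler proxies than a direct label. The point is only that a score learned from past hiring decisions separates two equally skilled candidates.

```python
from collections import defaultdict

# Hypothetical hiring history: (group, skill_score, was_hired).
# Skill is spread identically across groups, but group "A" was hired more often.
history = [
    ("A", 7, True), ("A", 5, True), ("A", 6, True), ("A", 4, False),
    ("B", 7, True), ("B", 5, False), ("B", 6, False), ("B", 4, False),
]

def learn_hire_rates(records):
    """Learn the share of past applicants hired in each group."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for group, _skill, was_hired in records:
        seen[group] += 1
        hired[group] += int(was_hired)
    return {group: hired[group] / seen[group] for group in seen}

def proxy_score(candidate_group, rates):
    """Score a new candidate by how often similar past applicants were hired."""
    return rates[candidate_group]

rates = learn_hire_rates(history)

# Two new candidates with identical skill, differing only by group.
score_a = proxy_score("A", rates)
score_b = proxy_score("B", rates)
print(f"Group A score: {score_a:.2f}, group B score: {score_b:.2f}")
```

The toy model never sees skill at all. It was asked to match past hires, and it does exactly that.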
AI bias in human feedback: whose preferences shape the model
The third stage is where human judgement gets baked in, through reinforcement learning from human feedback and the reward models that sit behind it.
Once a base model is trained, companies hire contractors to rate its answers. The raters prefer some and reject others. A smaller model is then trained to predict those preferences and used to shape the main model’s behaviour.
Whose preferences get collected matters. How the questions are phrased matters. Which answers are treated as polite, confident or hedging matters too. All of it becomes part of the model’s personality.
If the raters skew towards one culture or profession, the model’s idea of a good answer skews with them. If the guidelines tell raters to treat certain topics as settled, the model will later present those views as settled.
Reinforcement learning is how a messy base model is turned into a polished product. It is also how a particular set of tastes becomes invisible infrastructure.
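A toy sketch shows how much the rater pool matters. It is not how any production reward model works: the vote counts are invented, and the "reward model" here is just a vote share, assumed for illustration. The same pair of answer styles is rated by two hypothetical pools, and the style that gets rewarded flips.

```python
from collections import Counter

# Hypothetical ratings: each entry is the style a rater preferred when
# shown a "confident" answer and a "hedged" answer side by side.
pool_one = ["confident"] * 8 + ["hedged"] * 2  # raters from one background
pool_two = ["confident"] * 3 + ["hedged"] * 7  # raters from another

def learned_reward(votes):
    """Toy reward model: a style's reward is its share of rater votes."""
    counts = Counter(votes)
    return {style: counts[style] / len(votes) for style in ("confident", "hedged")}

def preferred_style(votes):
    """The style the main model would be nudged towards."""
    reward = learned_reward(votes)
    return max(reward, key=reward.get)

print("Pool one rewards:", preferred_style(pool_one))
print("Pool two rewards:", preferred_style(pool_two))
```

Neither pool is wrong. They simply disagree, and whichever pool was hired decides what the finished product sounds like.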
Real examples of AI bias in the wild
Real examples are easier to remember than mechanisms. Amazon scrapped an internal recruiting tool in the late 2010s. The tool was downgrading CVs that contained the word “women’s.” The previous decade of engineering hires had skewed heavily male, and the model learned to copy that.
Health algorithms used in American hospitals were shown to under-refer Black patients for extra care. The model used historical healthcare spending as a proxy for medical need. Because less had historically been spent on Black patients with the same level of need, the proxy reproduced the gap.
Facial recognition systems have repeatedly performed worse on darker-skinned faces and on women. Researchers at MIT documented the issue at scale, and the US National Institute of Standards and Technology later confirmed it across dozens of commercial systems. Each story is about proxies and data, not about anyone deciding to be unfair.
AI bias and the UK regulatory response
Regulators in the UK and Europe are now treating AI bias as a live consumer issue rather than a research curiosity. The Financial Conduct Authority has flagged algorithmic decision-making in lending, insurance and fraud detection. Firms must explain outcomes and test for disparate impact on protected groups.
The Information Commissioner’s Office has issued detailed guidance on fairness under UK GDPR. It makes clear that “the computer said so” is not a defence if a model produces discriminatory outcomes. The full ICO guidance on AI and data protection is the canonical UK reference.
At the European level, the AI Act now classifies certain uses of AI as high-risk. The list includes hiring, credit scoring, education and law enforcement. Systems in those areas carry specific obligations around data quality, logging and human oversight.
None of this makes the problem vanish. It does mean a firm selling or using these tools in Britain can no longer shrug when asked how it tested them.
A worked example
Suppose a UK lender uses an AI model to score loan applications. The model is trained on ten years of past decisions and outcomes.
Approval rates were historically higher in some postcodes than in others. Some of that reflected legitimate income differences. Some reflected legacy assumptions inside the bank.
The new AI model learns both patterns and treats them as equally valid. It quietly declines a higher share of applications from the same postcodes the bank used to under-serve. The model is more accurate than the old process by the bank’s chosen metric. But postcode is not a neutral variable. Where those postcodes track a protected characteristic such as ethnicity, the model is also reproducing bias against a protected group, and the FCA can ask the lender to prove otherwise.
The fix is not to switch the model off. It is to test outcomes across protected groups and document the gap. Then retrain or constrain the model, and keep a human in the loop on borderline cases. This is now the regulatory expectation, not a nice-to-have.
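The first step, testing outcomes across groups, can be sketched in a few lines. This is a minimal illustration, not a compliance test: the decisions are made up, the group labels are placeholders, and the 0.8 review threshold is an arbitrary assumption rather than a figure from the FCA or any other regulator.

```python
from collections import defaultdict

# Hypothetical model decisions: (group_label, approved).
decisions = (
    [("group_x", True)] * 80 + [("group_x", False)] * 20
    + [("group_y", True)] * 55 + [("group_y", False)] * 45
)

def approval_rates(records):
    """Share of applications approved in each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, was_approved in records:
        total[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / total[group] for group in total}

def approval_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    return min(rates.values()) / max(rates.values())

# Illustrative cut-off only, chosen for this sketch; not a regulatory figure.
REVIEW_THRESHOLD = 0.8

rates = approval_rates(decisions)
ratio = approval_ratio(rates)
needs_review = ratio < REVIEW_THRESHOLD

print(f"Approval rates: {rates}")
print(f"Ratio of lowest to highest: {ratio:.2f}, needs review: {needs_review}")
```

A gap on its own does not prove unfairness, since some of it may reflect legitimate differences. It does tell the lender where to look, and it gives the documentation something concrete to record.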
What this means for you
For anyone using these tools, the practical takeaway is not paranoia but habit. Treat AI outputs like advice from a confident but unfamiliar colleague who has read a lot and met almost no one. It is often right, occasionally wrong in predictable directions, and rarely tells you which is which.
When a model helps with something consequential, such as a shortlist of candidates, a lending decision, a medical question or a legal summary, ask one question. Would the same answer come back if the person or case were different in ways that should not matter?
If you cannot test that, assume the answer is provisional and keep a human in the loop. For the privacy side of using these tools, see our post on what information you should never put into an AI tool.
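Where a tool can be called from code, the one-question test can be run mechanically. The sketch below is an assumption-laden illustration: `toy_loan_score` is a made-up stand-in for a real model, and the field names and postcodes are invented. The check re-scores the same case with one field swapped and reports any change.

```python
def counterfactual_check(score, case, field, alternatives, tolerance=0.0):
    """Re-score a case with one field swapped and report what changed.

    Returns a dict mapping each alternative value to the change in score,
    including only changes larger than the tolerance.
    """
    baseline = score(case)
    changes = {}
    for value in alternatives:
        variant = {**case, field: value}  # copy of the case, one field changed
        delta = score(variant) - baseline
        if abs(delta) > tolerance:
            changes[value] = delta
    return changes

def toy_loan_score(applicant):
    """Hypothetical stand-in for a real model; it leans on postcode."""
    score = applicant["income"] / 1000
    if applicant["postcode"] == "ZZ1":  # invented postcode
        score -= 10
    return score

applicant = {"income": 40000, "postcode": "ZZ1"}
changes = counterfactual_check(
    toy_loan_score, applicant, "postcode", ["ZZ2", "ZZ3"]
)
print(changes or "No change when postcode is swapped")
```

An empty result does not prove a model is fair, because the field you swapped may not be the one doing the damage. A non-empty result is a clear reason to pause.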
In plain English
AI bias is what happens when a smart system quietly carries forward unfair patterns from the world it was trained on. It is not a personal opinion. It is a default the model learned.
The fix is not to trust the machine less in general. It is to test it for the cases that matter, and to keep a person in the loop where the answer changes someone’s life. AI bias does not require malice. It does require attention.