AI Explained

What can go wrong when AI agents act on your behalf?

AI agents don't just answer questions; they take actions. Here's what can go wrong when they act on your behalf, and how to reduce the risk.

There is a meaningful difference between an AI that gets things wrong in conversation and one that gets things wrong while acting on your behalf. When a chatbot produces a dodgy answer, you ignore it and move on. When an agent books the wrong flight or sends an embarrassing email to a client, the damage is done before you know it.

AI agents are becoming more capable and more widely deployed. They can browse the web, fill in forms, send messages, and schedule meetings. They can make purchases, execute code, and connect to the tools and accounts you use every day. That capability is genuinely useful. It is also where the risk sits.

Understanding what can go wrong with AI agents is not an argument against using them. It is an argument for using them thoughtfully, and for knowing which situations call for a bit more caution. Knowing what AI agents can actually do today is a useful starting point; this post is about where that capability creates risk.

Why AI agent mistakes are harder to reverse

Most of the things you do with a chatbot are reversible by default. You ask a question, get an answer, and decide what to do with it. Nothing changes in the world until you act. AI agents work differently. They are designed to act, and many of those actions are hard or impossible to undo.

A sent email cannot be unsent. A file deleted without a proper backup is gone. A purchase made in your name generates a real order, a real charge, and a real expectation from whoever is on the other side. An API call that modifies a database record may ripple across systems in ways that are difficult to trace and expensive to reverse.

The irreversibility problem is compounded by the fact that agents often work quickly and silently. Several actions may have already happened before you notice that something has gone wrong. Speed, in this context, is a risk multiplier rather than a feature.

Three ways AI agents go wrong

The first problem is interpretation. An agent is only as good as its reading of what you actually want. The instruction “clear my inbox” sounds clear enough. An agent might interpret it as archiving everything, marking everything as read, or in a worst case, deleting messages it cannot recover. The instruction “book me a table for Friday” leaves open which restaurant, which Friday, and who is paying.

This is not a theoretical risk. Language models are confident, and they do not always ask for clarification when they should. They fill in the gaps using reasonable-sounding assumptions. Those assumptions can be quietly wrong in ways that do not become obvious until the action is already taken. The more ambiguous the instruction, the more carefully you need to review what the agent plans to do before letting it proceed.

The second problem is permissions. To send an email on your behalf, an agent needs access to your email account. To book a meeting, it needs your calendar. To make a purchase, it may need a payment method. Agents are often granted broader access than the task actually requires. An agent given access to your entire Google Drive to find one document can, in principle, read, modify or share anything in that drive. Good agent design limits permissions to what a specific task requires and nothing more. Many current deployments do not meet that standard.
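
To make the idea concrete, here is a minimal sketch of task-scoped permissions in Python. Everything in it is an assumption made for illustration, not any real platform's API: the tool names, the allowlist, and the helper are all hypothetical.

```python
# A sketch of task-scoped permissions: the agent gets an allowlist of
# tools for one task, not blanket account access. All names here are
# hypothetical, not a real platform's API.

TOOLS = {
    "search_files": lambda query: f"(results for {query!r})",
    "read_file":    lambda path: f"(contents of {path})",
    "delete_file":  lambda path: f"(deleted {path})",
    "share_file":   lambda path, who: f"(shared {path} with {who})",
}

# The task is "find one document", so only these two tools are in scope.
ALLOWED_FOR_TASK = {"search_files", "read_file"}

def call_tool(name, *args):
    """Refuse any tool call the current task does not require."""
    if name not in ALLOWED_FOR_TASK:
        raise PermissionError(f"'{name}' is outside this task's scope")
    return TOOLS[name](*args)

print(call_tool("search_files", "Q3 report"))  # allowed

try:
    call_tool("delete_file", "Q3_draft.docx")  # not in scope for this task
except PermissionError as err:
    print(err)
```

The point is the shape rather than the code: the allowlist is defined per task, and anything outside it fails loudly instead of quietly succeeding.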

The third problem is cascading errors. An agent tasked with preparing the monthly client report might retrieve the wrong dataset and build a report on those numbers. It could distribute that report across your organisation before anyone realises the underlying figures were from the wrong month. Correcting one email is straightforward. Correcting the assumptions now embedded in a dozen people’s thinking is a different kind of problem. The risk multiplies when agents are connected to other agents, which is increasingly common. One agent’s output becomes another’s input, and errors compound faster than they can be spotted.

Privacy matters here too. Agents handling sensitive information carry the same data risks as any other tool with account access. The NCSC guidance on using AI tools at work covers the security considerations worth reviewing before granting broad permissions.

How to use AI agents more safely

The safest approach is to use agents that show you the planned action and wait for your approval before proceeding. Those that act first and tell you afterwards are considerably harder to supervise. This is sometimes called a human-in-the-loop approach, and it is worth insisting on for anything that is hard to undo.
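
In code, a human-in-the-loop checkpoint can be as simple as the sketch below. The helper name and the stand-in send call are assumptions for illustration; real agent platforms wire this up differently, but the shape is the same: the agent proposes, a person approves.

```python
# A minimal human-in-the-loop gate: the agent proposes, a person approves.
# The helper and the stand-in send call are illustrative, not a real API.

def confirm_and_run(description, execute):
    """Show the planned action and run it only on explicit approval."""
    print(f"Agent proposes: {description}")
    if input("Proceed? [y/N] ").strip().lower() != "y":
        print("Skipped.")
        return False
    execute()
    return True

confirm_and_run(
    "Send email 'Q3 figures' to client@example.com",
    lambda: print("(email sent)"),  # stand-in for the real, irreversible call
)
```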

Beyond that, the principle is simple: match the level of supervision to the cost of getting it wrong. Letting an agent draft a social media post and save it as a draft is low risk. Letting it send without review is higher risk. Letting agents place orders or modify financial records without a checkpoint deserves careful thought before going live.
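
One way to make that matching explicit is a simple policy table, sketched here with illustrative categories; the action names and supervision levels are assumptions, not any standard.

```python
# Supervision matched to the cost of a mistake. The action names and
# levels are illustrative, not a standard.

SUPERVISION = {
    "save_draft_post": "auto",    # reversible: let the agent run
    "publish_post":    "review",  # public: a human approves first
    "place_order":     "review",  # spends money: a human approves first
    "modify_records":  "block",   # needs a deliberate go-live decision
}

def supervision_for(action):
    # Anything not yet classified defaults to review, not to autonomy.
    return SUPERVISION.get(action, "review")

print(supervision_for("save_draft_post"))  # auto
print(supervision_for("wire_transfer"))    # review (unclassified default)
```

The defaulting choice matters: an action nobody has classified should fall back to review, not to autonomy.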

Keeping permissions narrow matters too. When you connect these tools to an account or service, give them only the access the specific task requires. Most serious agent platforms allow for scoped permissions, and using them is worth the extra setup time. Our post on when not to use AI covers the situations where AI agents should not be the final decision-maker.

The difference between a useful AI agent and a problematic one often comes down to how much autonomy it has before the first human review. Start small and expand from there.

Think of AI agents as a very fast, very capable junior colleague who does not always know when to stop and ask. That colleague might execute a task brilliantly. They might also interpret an instruction in a way you would never have anticipated, and complete it before you had a chance to check. The skill is in knowing when to give them latitude and when to stay close to the work.

A useful test before deploying any agent on a consequential task is to run it in dry-run or preview mode first. Most platforms that offer AI agents also offer a mode where the agent describes what it intends to do without actually doing it. Running that preview costs almost nothing, and it frequently surfaces assumptions you would not have caught otherwise. An agent planning to archive your sent folder when asked to “tidy up” is far less alarming at the preview stage than after the fact.
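
A dry-run wrapper can be sketched in a few lines. This assumes a hypothetical run_agent helper that takes the agent's planned actions as (description, callable) pairs; the specifics vary by platform.

```python
# A dry-run wrapper, sketched: report what would happen instead of
# doing it. The helper and the action list are hypothetical.

def run_agent(actions, dry_run=True):
    for description, execute in actions:
        if dry_run:
            print(f"[dry run] would: {description}")
        else:
            execute()

planned = [
    ("archive 214 messages in Sent", lambda: None),  # surfaced here, not after the fact
    ("mark 58 messages as read",     lambda: None),
]
run_agent(planned, dry_run=True)
```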

The same principle applies when handing AI agents access to external services. Start with a single, narrow permission and observe how the agent behaves before expanding its access. This is slower than granting full access up front, but the extra setup time is usually less than the time spent diagnosing a problem caused by an agent operating where you did not expect it.

The agents available today are genuinely useful tools. They are not, however, reliable enough to operate without oversight on anything consequential. Knowing what is safe to delegate and what needs a human check is, right now, one of the more important skills in using AI well. That judgement improves with experience, and it starts with understanding how these tools can fail.