AI Explained

What Is an AI Agent? From Tools to Systems That Act

An AI agent is the next step beyond a chatbot. A chatbot talks to you. An agent does things for you.

The difference sounds small when you say it out loud. In practice it changes what you can realistically offload to a machine, and it is why the phrase "AI agent" is suddenly everywhere.

How an AI agent differs from a chatbot

A regular language model is reactive. You ask a question, it answers. An agent takes a goal and works out how to achieve it. Usually it does this by taking multiple steps, using tools, reading the results, and deciding what to do next.

A chatbot asked to book a flight will tell you what to click. An agent asked to book a flight will open a browser, search fares, compare times, and return with an itinerary ready for you to approve. The loop is goal, plan, act, observe, adjust, until the thing is done.

If you have not used one yet, the cleanest mental model is a junior colleague who never gets tired. You give them a task and the rough shape of how to approach it. They go away, do the boring steps, and come back with a draft.

You check the work and either approve it or send it back. The thing in the middle that used to be a person is now software.

This is also why the same underlying model can be both a chatbot and an agent. The intelligence is the same. The harness around it is what changes.

For background on the model itself, the post on large language models is a useful primer before reading on.

How the architecture actually works

The architecture underneath usually looks like this. A large language model sits at the centre, doing the thinking and the planning. Around it sit tools the model can call: a web browser, a spreadsheet, a calendar, an email inbox, a code runner, a database.

The agent writes a short plan, picks a tool, calls it, reads what comes back, and decides what to do next. Every cycle of that loop is the model essentially asking itself a single question: am I closer to the goal, and if not, what should I try next?
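That loop is short enough to sketch in code. Everything below is illustrative: `call_model` and the tools stand in for whatever model API and integrations a real harness would wire up, and the model's "decision" is scripted so the sketch runs on its own.

```python
# A minimal sketch of the agent loop: plan, act, observe, repeat.
# The model and tools here are stand-ins, not a real API.

def call_model(goal, history):
    # A real harness would call an LLM here and parse its chosen next
    # action. We script two steps so the loop is runnable end to end.
    if not history:
        return {"tool": "search_fares", "args": {"route": "LHR-JFK"}}
    return {"tool": "finish", "args": {"result": "Itinerary ready for approval"}}

TOOLS = {
    "search_fares": lambda args: f"3 fares found for {args['route']}",
}

def run_agent(goal, max_steps=10):
    history = []  # observations so far: what each tool call returned
    for _ in range(max_steps):
        action = call_model(goal, history)                    # plan the next step
        if action["tool"] == "finish":
            return action["args"]["result"]                   # goal reached, stop
        observation = TOOLS[action["tool"]](action["args"])   # act
        history.append((action, observation))                 # observe, then loop
    return "Gave up: step limit reached"                      # safety valve against loops

print(run_agent("Book a flight"))
```

The `max_steps` cap is the unglamorous but important part: it is what stops a confused agent from retrying the same broken approach forever.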

[Illustration: how an AI agent uses tools in a loop to plan, act and check progress]

The tools are usually plugged in through a small piece of glue code. Until recently every vendor wrote its own glue, which made agents hard to move between systems. An open standard called the Model Context Protocol, published by Anthropic in late 2024, changed that: it lets the same agent talk to many tools without custom code each time.

You can read the spec on Anthropic’s announcement page. The detail matters less than the direction. The plumbing is becoming standard, which means agents will keep getting easier to deploy.
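To make "glue code" concrete, here is the kind of tool description a harness hands to the model: a name, a plain-language description, and the parameters the tool accepts. Standards like the Model Context Protocol formalise this shape so every vendor does not reinvent it. The dictionary below is an illustrative sketch, not the actual MCP wire format.

```python
# Sketch of a tool description the model can read and decide to call.
# Field names and types here are illustrative, not a real protocol schema.

calendar_tool = {
    "name": "create_event",
    "description": "Add an event to the user's calendar.",
    "parameters": {
        "title": "string",
        "start": "ISO 8601 datetime",
        "duration_minutes": "integer",
    },
}

print(calendar_tool["name"], "takes", ", ".join(calendar_tool["parameters"]))
```

The point of the standard shape is portability: an agent that understands this description can use the calendar tool without any code written specifically for that pairing.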

What AI agents are good at today

This is where things get genuinely useful. Because an agent can chain actions together, it can tackle tasks that would take a human thirty minutes of tedious clicking and copying. Pulling fifty invoices out of an inbox and into a spreadsheet.

Reading a long contract and flagging the clauses that differ from your company template. Watching a set of competitor websites and writing a weekly note when anything material changes. Booking meetings across three time zones without twelve emails.

The UK picture is already moving. Major banks such as Lloyds and NatWest have been running internal trials where agents handle the dull parts of compliance paperwork. Startups in London are building agent-first legal assistants. These pull records from Companies House and HM Land Registry and do the grinding research that junior solicitors used to do at two in the morning.

Even HMRC has quietly started exploring agent tooling to help caseworkers reconcile records faster. The technology is not waiting politely on the horizon. It is in live use, usually with a human still approving the final step.

The small business case is just as real. A solo accountant can hand an agent a folder of receipts and get a clean ledger back in minutes. A two-person marketing team can ask an agent to draft weekly competitor briefings overnight.

A property manager can have an agent answer routine tenant queries and only escalate the ones that need a person. None of this needs a bespoke build. The off-the-shelf tools available today, paid by the month, already do most of it.

Two common misconceptions about agents

A common misconception is that agents are a different model from the chatbot you already know. They are usually the same model, with a harness built around it. Claude, GPT, and Gemini can each run as an agent if you wrap them in the right loop and give them tools.

The intelligence is in the model. The usefulness is in the scaffolding. That is why you will see the same underlying systems sold under very different names depending on what tools they ship with.

The other big misconception is that agents are autonomous in the science fiction sense. They are not. They take the goal you hand them, inside the permissions you grant them. They stop when they get stuck or when you pull the plug.

They do not set their own goals. They do not keep running when you close the window. They do not go out and make money on their own. The interesting question is not whether they are secretly plotting.

It is how much of your day job they can quietly remove. If you want a clearer picture of how the underlying model thinks, the explainer on what artificial intelligence actually means is a good companion read.

Where AI agents fail and how to limit the damage

The failure modes are worth knowing. Agents get into loops, retrying the same broken approach. They misread a tool’s output and confidently head off in the wrong direction. They hallucinate a step that looks plausible but does not match reality.

This is why early deployments almost always keep a human in the loop. Anything that moves money, sends a message to a real person, or makes a legal commitment goes through a human check. The agent drafts, the human approves.

The other practical risk is permission scope. An agent that can read your inbox is a tool. An agent that can also send email, transfer money, and post on your behalf is a much bigger surface area for a mistake.

The right starting setting is read-only access to a narrow set of data. You widen the scope only after you have watched the agent work on the same kind of task several times and seen it behave.
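In code, that starting posture is nothing more than an allowlist checked before every tool call. The action names below are hypothetical; the pattern is what matters.

```python
# Sketch: an allowlist of what the agent may do, checked before every
# tool call. Start read-only; widen only after observing good behaviour.
# Action names are hypothetical examples.

ALLOWED_ACTIONS = {"read_inbox", "read_calendar"}  # no send, no transfer, no post

def guard(action_name):
    if action_name not in ALLOWED_ACTIONS:
        raise PermissionError(f"Agent tried '{action_name}', which is not permitted")
    return True

guard("read_inbox")        # permitted: read-only
try:
    guard("send_email")    # blocked until you explicitly grant it
except PermissionError as e:
    print(e)
```

Widening scope then becomes a deliberate, one-line change you make after watching the agent behave, rather than a default you have to claw back after a mistake.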

The technology is evolving fast, and the shape of what is possible is changing month by month. Long-running memory, where an agent remembers what happened across sessions, is becoming standard. Better tool integration through protocols like the Model Context Protocol means agents can plug into most business software without custom code.

Multimodal capability, where the same agent can read a spreadsheet, watch a video, and listen to a meeting recording, is no longer a research demo. Each of these improvements quietly expands the set of tasks you can realistically hand off. The set is growing faster than most buyers realise.

What an AI agent actually costs to run

There is a practical cost model worth understanding. Agents consume more tokens per task than a simple chat, because every step in the plan, act, observe loop uses the model again. A task that takes a chatbot a few seconds of compute can take an agent a few minutes and a few pence's worth of calls.

That cost is trivial for tasks that replace an hour of human time, but it is real. Sensible deployments measure cost per completed task and compare it to the alternative, not the headline per message price.

The pricing pages can look intimidating because they quote prices per million tokens. The simple way to translate that is to run a real task, watch the bill, and decide. A weekly report that took a junior researcher half a day might cost a few pounds to produce with an agent.

The numbers are usually small. The hard part is choosing tasks where the time saving is genuine, not just shifting work to checking the agent’s output.
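The translation from per-million-token pricing to cost per task is back-of-envelope arithmetic. Every number below is a made-up placeholder; substitute your own provider's rate and your own observed token counts.

```python
# Back-of-envelope cost per completed agent task. All figures illustrative.

PRICE_PER_MILLION_TOKENS = 3.00   # GBP, hypothetical blended rate
TOKENS_PER_LOOP_STEP = 4_000      # plan + tool output + decision, per cycle
STEPS_PER_TASK = 25               # an agent loops; a chatbot answers once

tokens_per_task = TOKENS_PER_LOOP_STEP * STEPS_PER_TASK
cost_per_task = tokens_per_task / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(f"~{tokens_per_task:,} tokens, about £{cost_per_task:.2f} per task")
```

Compare that final figure against the hourly cost of the human time the task replaces, not against the headline per-message price.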

How to try an AI agent in your week

If you are trying to work out whether an agent would help you, look at your week. List the tasks that are boring, repetitive, involve clicking around several apps, and rarely involve judgement. That is exactly the shape agents are good at today. Pay a small amount to try one on a single workflow.

Give it a narrow goal, watch it closely, and decide whether it saves time. This is the right way to meet the technology. It is also cheaper than hiring a consultant to tell you the same thing six months later.