Agency over Agents

AI hucksters want to sell us agents, but what we really need is agency.

As you might imagine, with this domain I'm frequently approached by speculators hoping to buy and flip it to one of the many projects around "Agentic AI". What is all the fuss about? Lots of people seem to be pursuing it. We read that it's a newer, better kind of AI where "agents" act independently and even cooperatively to work on tasks that we give them. People are working on agent operating systems and agentic frameworks. It sounds new, but I don't think it is. A few years ago, I found this taxonomy of software agents from 1996. Even then, people were arguing over what it took to be a real agent, and the debate continues today.

People are promising us agents that do great things.

Looking at my favorite language resource, I find this top definition of agent:

One that acts or has the power or authority to act.

Central to our expectations is the idea that an agent has power and authority, in other words, trust.

Because we're talking about digital systems, the "doing" is almost always calling an API. If the API isn't an official public API, it's a private one, and it might even be a one-off bespoke interface used by the agent. But it's an interface for an application program (the agent), so it's an API.

APIs can do all kinds of useful things. Here are some examples of things that you can do with one:

Check your bank account balance.
Order a replacement part for your car.
File your tax return.
Make a restaurant reservation.
Buy plane tickets.
Play a song on your home audio system.

Each of these actions requires some type of authority, represented digitally as an "authorization". So one of the first things that people do when setting up AI agents is give them these authorizations, usually represented as API keys or tokens that they get from OAuth.

This could be a lot.

But can we trust those agents?

Dreamer was a much-hyped startup that was building a "platform for agents". Dreamer shut down early in an acquihire, and in his community letter discussing the transition, the Dreamer CEO celebrated the scope of tasks that people hoped to tackle with agents.

You've built and shared agents to manage email, calendar, and to-do's, create learning tools for your kids, learn new languages, plan trips with friends, become better cooks, help you with work, achieve your health goals, or simply to creatively express yourselves—all sorts of surprising and uniquely personal needs. These are agents as unique as you are, because they're shaped exactly the way you want them to be.

But the more that we enable our agents to do things that we might want them to do, the more they are able to do things that we don't want them to do. Who really wants to trust a large tech company, a startup, a government, or really anyone that they don't know with all this power?

When we hand our credit card to a server after we've finished a meal, we know that our credit card company will protect us from accidental or malicious overcharges. What protects us from AI agents? Is a fly-by-night project like Dreamer (which raised at least 56 million dollars before bailing out) setting anything but a negative example for what we can expect from AI agents?

As one anonymous commenter pointed out:

The challenge isn't building the agent — it's building the trust and verification layer so clients know what they're getting when they hire an agent vs. a human.

Agents are only as good as the tools that build them.

The biggest selling point of Dreamer was that it made it easy for people who didn't think of themselves as developers to make apps. In this interview with David Singleton, Dreamer looked like vibe coding without the code, although around 40 minutes in, we hear about the Dreamer CLI that lets developers get deep into generated code. This isn't the audience that Dreamer was built for, but realistically, it was probably the audience that Dreamer needed in order to get working apps on its platform. But how does this earn anyone's trust?

Agency is what the agent does.

Going back to Wordnik, agency is "the state of being in action or of exerting power." Agency is control over what the agent does, and as we've seen, what an agent does is its IO.

IO is what the agent does.

IO is a developer's tool that was created to make it easier for developers to use networked APIs (i.e. to do things). In doing this, IO also made it easier for developers and their applications to not do things that we don't want them to do.

IO empowers and observes agents.

IO is a proxy. That means that it sits between users and applications and between applications and the services that they use. When an application uses IO to check a bank account balance, the application sends the request to IO, IO sends the request to the bank, the bank responds to IO, and IO responds to the application. If the bank requires secrets to authorize the request (it should!), IO provides them and the application never sees or touches them. That means that the application can never leak or abuse those secrets. It also means that we can observe everything that the application does using IO's built-in record-keeping.
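To make that request flow concrete, here is a minimal sketch in Python. It is not IO's implementation: the class name CredentialProxy, the fake_bank function, and the "bank-secret" value are all hypothetical, and the "bank" is an in-process function rather than a real network service. It only illustrates the pattern of a proxy that attaches the secret, forwards the call, and keeps a record.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Headers = Dict[str, str]
Response = Dict[str, object]


def fake_bank(headers: Headers, path: str) -> Response:
    """Hypothetical upstream service that requires a bearer secret."""
    if headers.get("Authorization") != "Bearer bank-secret":
        return {"status": 401, "body": "unauthorized"}
    return {"status": 200, "body": {"path": path, "balance": 100}}


@dataclass
class CredentialProxy:
    """Illustrative proxy: holds secrets, forwards calls, records activity."""
    secrets: Dict[str, str]
    upstreams: Dict[str, Callable[[Headers, str], Response]]
    log: List[Dict[str, object]] = field(default_factory=list)

    def forward(self, app: str, service: str, path: str) -> Response:
        # The proxy, not the application, attaches the secret.
        headers = {"Authorization": f"Bearer {self.secrets[service]}"}
        response = self.upstreams[service](headers, path)
        # Record what was done and by whom, without logging the secret.
        self.log.append(
            {"app": app, "service": service, "path": path,
             "status": response["status"]}
        )
        return response


proxy = CredentialProxy(
    secrets={"bank": "bank-secret"},
    upstreams={"bank": fake_bank},
)
# The application names the service and path; it never handles the secret.
reply = proxy.forward("budget-app", "bank", "/balance")
```

In this sketch the application only ever holds the reply and the proxy's log entry, neither of which contains the secret.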

IO regulates agents using vouchers.

IO secures the credentials that applications need to make API calls. We can control which applications can use these credentials with workload identity, and we can add additional application-specific restrictions using an IO feature that we call vouchers. A voucher is an authorization token that can be presented with a request, usually along with a proof of identity, that describes additional restrictions that govern its use to call APIs.

Here are some of the things that can be specified in a voucher:

Only specifically named users can use it.
Only specifically named API operations can be performed.
Only requests with specific HTTP paths are allowed.
Usage can be limited to a maximum number of calls.
The voucher can expire at a specified time.
Requests can be accepted only from certain IP addresses or address ranges.
Requests can be required to contain specified request field values.

Many of these restrictions go beyond the controls that we get with API authorization scopes. With that last one, we control not just whether an API can be called but how it is used. For example, we might allow a DNS management API to only create new text records for a specified domain, even though the API itself allows arbitrary new records to be created for any domain that our account owns.
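The restrictions listed above can be sketched as a simple check in Python. This is an illustration under stated assumptions, not IO's voucher format: the Voucher and Request classes, their field names, the "CreateRecord" operation, and the example domain are all hypothetical. Each restriction is optional, and a request is allowed only if it satisfies every restriction that is set.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class Request:
    user: str
    operation: str
    path: str
    source_ip: str
    fields: Dict[str, str]


@dataclass
class Voucher:
    """Hypothetical voucher; None means 'no restriction of this kind'."""
    users: Optional[FrozenSet[str]] = None
    operations: Optional[FrozenSet[str]] = None
    paths: Optional[FrozenSet[str]] = None
    max_calls: Optional[int] = None
    expires_at: Optional[float] = None          # seconds since the epoch
    networks: Optional[Tuple[str, ...]] = None  # CIDR ranges
    required_fields: Dict[str, str] = field(default_factory=dict)
    calls_used: int = 0

    def authorize(self, req: Request, now: float) -> bool:
        checks = [
            self.users is None or req.user in self.users,
            self.operations is None or req.operation in self.operations,
            self.paths is None or req.path in self.paths,
            self.max_calls is None or self.calls_used < self.max_calls,
            self.expires_at is None or now < self.expires_at,
            self.networks is None or any(
                ipaddress.ip_address(req.source_ip) in ipaddress.ip_network(n)
                for n in self.networks
            ),
            all(req.fields.get(k) == v
                for k, v in self.required_fields.items()),
        ]
        if not all(checks):
            return False
        self.calls_used += 1  # count only the calls that were allowed
        return True


# The DNS example: only TXT records, only for one domain.
voucher = Voucher(
    operations=frozenset({"CreateRecord"}),
    networks=("10.0.0.0/8",),
    expires_at=1000.0,
    max_calls=2,
    required_fields={"type": "TXT", "domain": "example.com"},
)
txt_request = Request("alice", "CreateRecord", "/v1/records", "10.1.2.3",
                      {"type": "TXT", "domain": "example.com", "value": "hi"})
a_request = Request("alice", "CreateRecord", "/v1/records", "10.1.2.3",
                    {"type": "A", "domain": "example.com", "value": "1.2.3.4"})
results = [voucher.authorize(txt_request, now=10.0),
           voucher.authorize(a_request, now=10.0)]
```

Here the text-record request is accepted and the address-record request is refused, even though both name the same API operation. An authorization scope alone could not make that distinction.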

Finally, IO's vouchers are extensible. Anyone to whom you give a voucher can use it to create a new voucher with additional restrictions that they can pass along to someone else. Restrictions accumulate, so the restrictions that are added can't be removed without invalidating the voucher.
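One well-known way to get this "add but never remove" property is a chained HMAC, as used in macaroon-style tokens. The sketch below assumes that construction purely for illustration. The source does not say how IO implements vouchers, and the function names mint, attenuate, and verify are hypothetical. Each new restriction is signed with the previous signature as the key, so a holder can extend the chain, but dropping a restriction leaves a signature that no longer verifies.

```python
import hashlib
import hmac
from typing import List, Tuple

# A voucher here is just (restrictions, signature).
Voucher = Tuple[List[str], bytes]


def _chain(key: bytes, restriction: str) -> bytes:
    """Sign one restriction using the previous signature as the key."""
    return hmac.new(key, restriction.encode("utf-8"), hashlib.sha256).digest()


def mint(root_key: bytes, restriction: str) -> Voucher:
    """Issuer creates the first voucher from a secret root key."""
    return [restriction], _chain(root_key, restriction)


def attenuate(voucher: Voucher, restriction: str) -> Voucher:
    """Any holder can add a restriction without knowing the root key."""
    restrictions, signature = voucher
    return restrictions + [restriction], _chain(signature, restriction)


def verify(root_key: bytes, voucher: Voucher) -> bool:
    """Issuer replays the chain from the root key and compares signatures."""
    restrictions, signature = voucher
    key = root_key
    for restriction in restrictions:
        key = _chain(key, restriction)
    return hmac.compare_digest(key, signature)


root_key = b"known-only-to-the-issuer"
original = mint(root_key, "operation=CreateRecord")
narrowed = attenuate(original, "max_calls=5")

# Attempt to strip the added restriction while keeping the new signature.
stripped = (narrowed[0][:1], narrowed[1])
```

In this sketch both the original and the narrowed voucher verify, while the stripped one does not, because its signature no longer matches its list of restrictions.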

Want to know more?

Follow @agent.io on Bluesky.
