Developer Tools 11 min read

How to Design Software and APIs That AI Agents Can Actually Use

Agents now read your repos, call your APIs, and scaffold their own code. Designing for them is a different discipline than designing for human eyes and mouse clicks.

Key Takeaways

  • An AI agent is now a first-class consumer of your software, and it perceives your product through text, types, and exit codes, not through a rendered screen.
  • GUI-only and Figma-only design systems are effectively invisible to agents; if a capability has no machine-readable surface, the agent cannot find or use it.
  • Design for agents with seven properties: machine-readable surfaces, headless access, deterministic and idempotent operations, typed contracts with predictable errors, discoverable naming, sandboxed least-privilege execution, and traceable actions.
  • Predictable, recoverable error shapes matter more than happy-path ergonomics, because an agent's main loop is observing a result and deciding what to do next.
  • Good function-calling interfaces are small, single-purpose tools with precise descriptions and structured arguments, not one mega-endpoint with a free-text blob.
  • Every agent action should be traceable, so you can audit, debug, and constrain what autonomous code actually did to your systems.

For thirty years we designed software for one kind of user: a human with eyes, a cursor, and patience for onboarding. Every affordance assumed a person who could read a tooltip, infer intent from layout, and click through an error dialog. That assumption now fails for a large class of users. A growing share of the traffic hitting your product comes from AI agents that read your repository, call your API, scaffold code against your SDK, and decide what to do next based on the bytes you return.

Agents do not perceive software the way people do. They cannot see a rendered screen, cannot interpret a beautiful Figma frame, and cannot guess that the gear icon means settings. They perceive your product through text, types, structured responses, and exit codes. Designing for that consumer is a different discipline — and most software is, right now, accidentally hostile to it. This is a guide to building agent-legible software: systems an autonomous program can discover, operate, and recover from without a human in the loop.

What changed: the agentic-developer shift

The shift is not theoretical. Coding agents already clone repos and read the source to understand how a library works. Tool-using assistants call live APIs to complete tasks. Build agents scaffold entire features from a prompt. In each case the agent is doing what a developer used to do — reading docs, calling endpoints, wiring components — except it does it from raw text at machine speed, with none of the human ability to squint at a UI and figure it out.

That inverts a core assumption. The most important interface to your software is no longer the one a person looks at. It is the one a program can parse. If a capability of your product has no machine-readable surface, then for a rapidly growing class of users that capability does not exist. A button with no underlying documented endpoint is invisible. A design token that lives only in a Figma file is unreachable. A feature you can only reach through a multi-step wizard is, to an agent, a locked door with no handle.

Why are GUI-only and Figma-only design systems invisible to agents?

Design systems are the clearest example of the problem. A typical design system lives as a Figma library plus a styleguide site: gorgeous, precise, and completely opaque to a program. An agent asked to build an on-brand component cannot open Figma, cannot read a screenshot of your color palette with any reliability, and cannot infer your spacing scale from a marketing page. So it does what models do under uncertainty: it makes something up. You get a slightly-wrong blue, an invented border radius, and spacing that drifts from your system on every generation.

The failure is not the model's. It is that the design system has no surface the model can read. The fix is to treat machine-readability as a first-class output of the design system, not an afterthought. Tokens should ship as JSON. Components should ship as real, importable code. Documentation should be available as plain text an agent can fetch. Our piece on building an AI agent for your business applies the same idea to internal tools.

The principles of agent-legible software

Across the systems we build at Game Changer Labs, the same handful of properties separate software an agent can use from software it merely bounces off. Treat these as a checklist.

1. Machine-readable surfaces

Every capability needs a representation a program can parse without a human. In practice that means three artifacts. Publish your design tokens as JSON so colors, spacing, and typography are values, not pixels. Publish your API as an OpenAPI spec so an agent can enumerate operations, parameters, and response shapes. And publish an llms.txt at your root: a curated, link-rich text index that points agents at the docs, schemas, and endpoints that matter, instead of leaving them to scrape and guess. These three files turn a product from something you look at into something a program can reason about.
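
As a rough illustration of why a published spec matters, the sketch below shows how little code a program needs to enumerate every operation once an OpenAPI document exists. The types cover only the handful of OpenAPI fields the sketch reads, and the example spec (its paths and operationId values) is hypothetical, not a real API.

```typescript
// Minimal slice of an OpenAPI 3 document: only the fields this sketch reads.
type Operation = { operationId?: string; summary?: string };
type HttpMethod = "get" | "post" | "put" | "patch" | "delete";
type PathItem = Partial<Record<HttpMethod, Operation>>;
type OpenApiDoc = { paths: Record<string, PathItem> };

const METHODS: readonly HttpMethod[] = ["get", "post", "put", "patch", "delete"];

// Flatten the spec into one line per callable operation,
// in path order, then in METHODS order within each path.
function listOperations(doc: OpenApiDoc): string[] {
  const out: string[] = [];
  for (const [path, item] of Object.entries(doc.paths)) {
    for (const method of METHODS) {
      const op = item[method];
      if (op) {
        out.push(`${method.toUpperCase()} ${path} (${op.operationId ?? "unnamed"})`);
      }
    }
  }
  return out;
}

// Hypothetical spec, for illustration only.
const exampleSpec: OpenApiDoc = {
  paths: {
    "/tokens": { get: { operationId: "listTokens" } },
    "/components": {
      get: { operationId: "listComponents" },
      post: { operationId: "createComponent" },
    },
  },
};

const operations = listOperations(exampleSpec);
// operations is:
// ["GET /tokens (listTokens)",
//  "GET /components (listComponents)",
//  "POST /components (createComponent)"]
```

Without the spec, an agent would have to recover the same list by scraping documentation pages and guessing.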

2. Headless and CLI access, not just GUIs

If the only way to use a feature is to click through an interface, an agent cannot use it. Every meaningful capability should have a headless path: a CLI command, a callable API, or a library function. A command-line interface is especially agent-friendly because it is text in, text out, with a clear exit code — exactly the shape an agent's tool loop expects. The GUI is for humans; the headless surface is for everyone, including the humans who want to script things.
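
One way to picture a headless path is a thin adapter that turns arguments into text output plus an exit code. The sketch below is hypothetical: the radius command, the usage string, and the exit-code convention (0 success, 1 bad value, 2 bad usage) are assumptions for illustration and are not taken from any real CLI. The radius values are the ones in the token output later in this article.

```typescript
// Result shape a tool loop can consume: text out plus an exit code.
type CliResult = { stdout: string; stderr: string; exitCode: number };

// The underlying capability. The same function could sit behind a GUI
// control, an HTTP route, or a CLI command.
const RADIUS = new Map<string, number>([
  ["sm", 8],
  ["md", 16],
  ["lg", 24],
  ["full", 9999],
]);

// Thin CLI adapter: parse argv, call the capability, and report the
// outcome through stdout, stderr, and the exit code.
function runCli(argv: string[]): CliResult {
  const command = argv[0];
  const arg = argv[1];
  if (command !== "radius" || arg === undefined) {
    return { stdout: "", stderr: "usage: radius <sm|md|lg|full>", exitCode: 2 };
  }
  const value = RADIUS.get(arg);
  if (value === undefined) {
    return { stdout: "", stderr: `unknown radius: ${arg}`, exitCode: 1 };
  }
  return { stdout: JSON.stringify({ radius: { [arg]: value } }), stderr: "", exitCode: 0 };
}

const okRun = runCli(["radius", "md"]);
// okRun.exitCode is 0 and okRun.stdout is '{"radius":{"md":16}}'
```

The agent never needs to interpret a screen. It branches on the exit code and parses stdout only when the code is 0.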

3. Deterministic and idempotent operations

Agents retry constantly. They hit timeouts, lose connections, and re-run steps inside planning loops, so any operation can fire more than once. Two properties make that safe. Determinism: the same inputs produce the same outputs, so an agent can predict and verify results. Idempotency: running an operation twice is indistinguishable from running it once, usually via an idempotency key on creates and updates. Without these, a single retry can double-charge a customer or create duplicate records, and an autonomous loop will find that edge case faster than any human tester.
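
A minimal sketch of the idempotency-key pattern follows. The ChargeService class, its method names, and the in-memory Map are illustrative assumptions. A real service would need a durable store, protection against concurrent duplicates, and a policy for a reused key that arrives with a different payload.

```typescript
type Charge = { id: string; amountCents: number };

// In-memory stand-in for a durable store keyed by idempotency key.
class ChargeService {
  private readonly byKey = new Map<string, Charge>();
  private nextId = 1;

  // A repeated call with the same key returns the original charge
  // instead of creating a second one.
  createCharge(idempotencyKey: string, amountCents: number): Charge {
    const existing = this.byKey.get(idempotencyKey);
    if (existing !== undefined) {
      return existing;
    }
    const charge: Charge = { id: `ch_${this.nextId}`, amountCents };
    this.nextId += 1;
    this.byKey.set(idempotencyKey, charge);
    return charge;
  }

  // Number of distinct charges actually created.
  count(): number {
    return this.byKey.size;
  }
}

const chargeService = new ChargeService();
const firstAttempt = chargeService.createCharge("order-42", 1999);
const retryAttempt = chargeService.createCharge("order-42", 1999); // simulated retry after a timeout
// firstAttempt.id === retryAttempt.id, and chargeService.count() is 1
```

The agent generates the key once per logical operation and reuses it on every retry, so the server can tell a retry apart from a genuinely new request.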

4. Typed contracts and predictable error shapes

The most underrated part of agent design is the error path. An agent's main loop is: call something, read the result, decide what to do next. That decision is only as good as the result you return. A 500 with an HTML stack trace tells the agent nothing actionable. A structured error with a stable machine-readable code, a human-readable message, and a hint about whether the operation is retryable lets the agent recover on its own — back off and retry, fix an argument, or escalate. Type your inputs and outputs, and make every failure look like data, not like an exception that escaped.

{
  "error": {
    "code": "invalid_argument",
    "field": "radius",
    "message": "radius must be one of: sm, md, lg, full",
    "retryable": false
  }
}
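
To show how an agent might branch on an error like the one above, here is a small sketch. The field names mirror the JSON, but the set of error codes, the ToolResult wrapper, and the action names are assumptions made for illustration.

```typescript
// Error shape mirroring the JSON above. The list of codes is hypothetical.
type ToolError = {
  code: "invalid_argument" | "rate_limited" | "unavailable" | "permission_denied";
  field?: string;
  message: string;
  retryable: boolean;
};

// Every call returns data, whether it succeeded or failed.
type ToolResult<T> =
  | { status: "ok"; data: T }
  | { status: "error"; error: ToolError };

type NextAction = "continue" | "retry_with_backoff" | "fix_arguments" | "escalate";

// Decide the next step from structured fields only, never by parsing
// the human-readable message.
function decideNextAction<T>(result: ToolResult<T>): NextAction {
  if (result.status === "ok") return "continue";
  if (result.error.retryable) return "retry_with_backoff";
  if (result.error.code === "invalid_argument") return "fix_arguments";
  return "escalate";
}

const failedCall: ToolResult<string> = {
  status: "error",
  error: {
    code: "invalid_argument",
    field: "radius",
    message: "radius must be one of: sm, md, lg, full",
    retryable: false,
  },
};

const nextAction = decideNextAction(failedCall);
// nextAction is "fix_arguments"
```

The message stays useful for a human reading logs, while the code, field, and retryable flag carry everything the loop needs.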

5. Discoverability and predictable naming

An agent finds capabilities by reading names and descriptions, so naming is a functional interface, not a cosmetic choice. Resources, commands, and parameters should follow a consistent, guessable convention. If listing is called list in one place, it should not be called fetchAll in another. Predictable naming means an agent that has seen part of your API can correctly guess the rest, which is exactly the generalization that makes tool use reliable.
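
A naming convention is easier to keep when something checks it. The sketch below is one hypothetical way to flag operation names that use a discouraged verb. The verb table is an assumed house convention, not a standard, and the operation names are made up for the example.

```typescript
// Discouraged verb prefix paired with the canonical verb to use instead.
// This particular table is an assumption for illustration.
const CANONICAL_VERBS: ReadonlyArray<readonly [string, string]> = [
  ["fetchAll", "list"],
  ["getAll", "list"],
  ["remove", "delete"],
  ["add", "create"],
];

type NamingIssue = { name: string; suggestion: string };

// Flag camelCase names that start with a discouraged verb. The character
// after the verb must be uppercase (or the name must end there), so
// "addressLookup" is not mistaken for the verb "add".
function findNamingIssues(operationNames: string[]): NamingIssue[] {
  const issues: NamingIssue[] = [];
  for (const name of operationNames) {
    for (const [prefix, canonical] of CANONICAL_VERBS) {
      const rest = name.slice(prefix.length);
      if (name.startsWith(prefix) && (rest === "" || /^[A-Z]/.test(rest))) {
        issues.push({ name, suggestion: canonical + rest });
        break;
      }
    }
  }
  return issues;
}

const namingIssues = findNamingIssues([
  "listTokens",
  "fetchAllComponents",
  "createCard",
  "removeCard",
  "addressLookup",
]);
// namingIssues flags two names:
//   fetchAllComponents -> listComponents
//   removeCard         -> deleteCard
```

A check like this could run in CI against the operation names in a spec, so drift is caught before an agent ever encounters it.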

6. Least-privilege execution and sandboxing

The moment an agent writes and runs code, you have invited untrusted input into your runtime. Agent-legible does not mean agent-trusted. Tool execution must run under least privilege: no ambient credentials, a constrained filesystem, controlled network egress, and hard CPU, memory, and time limits. We run agent executors inside Firecracker-style microVMs so each execution is disposable and isolated from the host and from other tenants. The design rule is simple — assume the code the agent runs is adversarial, and make that assumption cheap to hold.
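
Least privilege is easiest to reason about as an explicit allowlist in which anything not granted is denied. The sketch below is a simplified policy check under stated assumptions: the SandboxPolicy fields, the request kinds, and the example host are hypothetical, paths are assumed to be absolute POSIX paths, and symlinks are not considered. It illustrates the decision logic only. Actual isolation still has to come from the sandbox itself, such as a microVM.

```typescript
// Explicit grants. Anything not listed here is denied.
type SandboxPolicy = {
  writableRoot: string;   // the only directory tree the agent may write to
  allowedHosts: string[]; // the only hosts the agent may contact
};

type SandboxRequest =
  | { kind: "write_file"; path: string }
  | { kind: "http_request"; host: string };

// Collapse "." and ".." segments so "../" cannot escape the root.
// Assumes absolute POSIX-style paths.
function normalizePath(path: string): string {
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return "/" + parts.join("/");
}

function isAllowed(policy: SandboxPolicy, request: SandboxRequest): boolean {
  if (request.kind === "write_file") {
    const root = normalizePath(policy.writableRoot);
    const target = normalizePath(request.path);
    const prefix = root === "/" ? "/" : root + "/";
    return target === root || target.startsWith(prefix);
  }
  return policy.allowedHosts.includes(request.host);
}

const examplePolicy: SandboxPolicy = {
  writableRoot: "/workspace",
  allowedHosts: ["registry.npmjs.org"],
};

const insideRoot = isAllowed(examplePolicy, { kind: "write_file", path: "/workspace/src/card.tsx" });
const escapeAttempt = isAllowed(examplePolicy, { kind: "write_file", path: "/workspace/../etc/passwd" });
// insideRoot is true, escapeAttempt is false
```

Comparing against the root plus a trailing slash matters: a plain prefix match would let /workspace-evil pass as if it were inside /workspace.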

7. Observability and tracing of agent actions

When a program acts autonomously, you need to know exactly what it did. Every agent action — every tool call, every API request, every file write — should emit a trace you can inspect after the fact. Tracing is how you debug a misbehaving plan, audit what an agent touched, and prove to a compliance reviewer that an autonomous system stayed inside its lane. Without it, an agent is a black box making changes you cannot explain.
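
One simple way to get a trace for every action is to wrap each tool so that recording cannot be skipped. The sketch below is a minimal, synchronous version with hypothetical names (TraceEvent, traced, the write_file tool). A production trace would typically also carry timestamps, a run or session identifier, and would handle asynchronous tools.

```typescript
type TraceEvent = {
  tool: string;
  args: unknown;
  status: "ok" | "error";
  detail: string; // serialized result on success, error message on failure
};

// Wrap a tool so every call appends exactly one event to the log,
// whether the call returns or throws. Errors are re-thrown unchanged.
function traced<A, R>(tool: string, fn: (args: A) => R, log: TraceEvent[]): (args: A) => R {
  return (args: A): R => {
    try {
      const result = fn(args);
      log.push({ tool, args, status: "ok", detail: String(JSON.stringify(result)) });
      return result;
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      log.push({ tool, args, status: "error", detail });
      throw err;
    }
  };
}

const auditLog: TraceEvent[] = [];

// Hypothetical tool that pretends to write a file and reports a byte count.
const writeFileTool = traced(
  "write_file",
  (a: { path: string; content: string }) => ({ bytes: a.content.length }),
  auditLog,
);

writeFileTool({ path: "src/card.tsx", content: "hello" });
// auditLog now holds one event:
// { tool: "write_file", args: {...}, status: "ok", detail: '{"bytes":5}' }
```

Because the wrapper sits between the agent and the tool, the log reflects what was actually executed rather than what the agent says it did.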

A worked example: how gcl-cli is built for agents

Our design tool gcl-cli is the principles above made concrete. It is a command-line tool — run with npx gcl-cli — built from the start for two kinds of users: human developers and the agents working alongside them. Its design language is obsidian glassmorphism, but the point here is the interface, not the aesthetic.

Running gcl-cli tokens emits the entire design system as machine-readable JSON. An agent does not look at a swatch; it reads exact values and uses them:

$ npx gcl-cli tokens
{
  "color": {
    "bg.base":    "#0B0B0F",
    "surface.glass": "rgba(255,255,255,0.04)",
    "text.primary": "#F5F5F7",
    "accent.iris":  "#6E62FF"
  },
  "radius": { "sm": 8, "md": 16, "lg": 24, "full": 9999 },
  "space":  { "1": 4, "2": 8, "3": 12, "4": 16, "6": 24 }
}

That single command is the difference between an agent that guesses your brand and one that reproduces it precisely. The accent is exactly #6E62FF, the medium radius is exactly 16, and the agent never has to interpret an image to know it.
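
To make that concrete, the sketch below shows how a program might consume the JSON. The token values are copied from the output above. The resolveToken helper and the "group.key" reference format are assumptions: the format is inferred from token names such as color.surface.glass and radius.lg, and is not a documented gcl-cli API.

```typescript
type TokenValue = string | number;
type Tokens = Record<string, Record<string, TokenValue>>;

// Values copied from the `gcl-cli tokens` output above.
const designTokens: Tokens = {
  color: {
    "bg.base": "#0B0B0F",
    "surface.glass": "rgba(255,255,255,0.04)",
    "text.primary": "#F5F5F7",
    "accent.iris": "#6E62FF",
  },
  radius: { sm: 8, md: 16, lg: 24, full: 9999 },
  space: { "1": 4, "2": 8, "3": 12, "4": 16, "6": 24 },
};

// Read only an object's own keys, so names like "toString" never match.
function own<T>(obj: Record<string, T>, key: string): T | undefined {
  return Object.prototype.hasOwnProperty.call(obj, key) ? obj[key] : undefined;
}

// Resolve a reference such as "color.accent.iris": the first segment is
// the group, everything after the first dot is the key within it.
function resolveToken(all: Tokens, ref: string): TokenValue {
  const dot = ref.indexOf(".");
  const group = dot === -1 ? undefined : own(all, ref.slice(0, dot));
  const value = group === undefined ? undefined : own(group, ref.slice(dot + 1));
  if (value === undefined) {
    throw new Error(`unknown token: ${ref}`);
  }
  return value;
}

const accent = resolveToken(designTokens, "color.accent.iris"); // "#6E62FF"
const mediumRadius = resolveToken(designTokens, "radius.md");   // 16
```

An unknown reference throws instead of returning a fallback, which keeps an agent from silently shipping an invented value.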

Running gcl-cli component goes further and writes a ready-made React component to disk — real code an agent can import and extend, not a picture it has to reverse-engineer:

$ npx gcl-cli component card --variant glass
  created  src/components/ui/glass-card.tsx
  tokens   color.surface.glass, radius.lg, space.4
  exit 0

Notice the properties at work: the output is deterministic, the operation is idempotent (running it again overwrites the same file rather than creating glass-card-2.tsx), the surface is headless and scriptable, and the exit code reports success cleanly. An agent can chain tokens and component to scaffold an entire on-brand interface without a human ever opening a design file. That is what agent-legible looks like in production.

How do you design good tool and function-calling interfaces?

Most teams will expose capabilities to agents through function calling, and the quality of those tool definitions decides whether the agent succeeds. A model picks a tool by reading its name, description, and argument schema — so those are the interface. A few rules carry most of the weight.

Anatomy of a good agent tool

Design rules: what separates a tool an agent calls correctly from one it misuses.

  • Scope: Small and single-purpose. One tool, one job. Resist the mega-tool that takes a mode flag and does five different things.
  • Description: State exactly when to use it and when not to. The description is a prompt the model reads at selection time, not documentation for later.
  • Arguments: Structured and typed, with enums where possible. Avoid a single free-text field that pushes parsing onto the model.
  • Return value: Structured data the next step can parse, plus a predictable error shape so a failed call is still actionable.

Prefer a handful of focused tools over one general-purpose endpoint. Overlapping or vague tools cause the classic failure mode where the model confidently calls the wrong one. Tight schemas with enumerated values constrain the model toward valid calls and turn many would-be runtime errors into impossible states. When you are choosing the frameworks to build all this on, our roundup of the best open-source AI agent and LLM tools covers the libraries we reach for, and our guide to choosing between on-device and cloud AI helps decide where the model that drives these tools should actually run.
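
As a sketch of what a tight schema buys you, the example below defines one small tool and validates a call against it. The schema format is a hand-rolled, JSON-Schema-like subset written for this illustration, not any particular vendor's function-calling format. The create_component tool and its radius argument are hypothetical, loosely modeled on the gcl-cli component command and the radius values shown earlier.

```typescript
// A small JSON-Schema-like subset, hand-rolled for this sketch.
type StringParam = { type: "string"; description: string; enum?: readonly string[] };

type ToolDefinition = {
  name: string;
  description: string;
  parameters: Record<string, StringParam>;
  required: readonly string[];
};

// One job, a description that says when to use it, and enumerated arguments.
const createComponentTool: ToolDefinition = {
  name: "create_component",
  description:
    "Write one UI component file to disk. Use when asked to add a new component. " +
    "Do not use to read design tokens.",
  parameters: {
    component: { type: "string", description: "Component to generate.", enum: ["card"] },
    variant: { type: "string", description: "Visual variant.", enum: ["glass"] },
    radius: { type: "string", description: "Corner radius token.", enum: ["sm", "md", "lg", "full"] },
  },
  required: ["component", "variant"],
};

function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Return a list of problems. An empty list means the call is valid.
function validateCall(tool: ToolDefinition, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const name of tool.required) {
    if (!hasOwn(args, name)) problems.push(`missing required argument: ${name}`);
  }
  for (const [name, value] of Object.entries(args)) {
    const param = hasOwn(tool.parameters, name) ? tool.parameters[name] : undefined;
    if (param === undefined) {
      problems.push(`unknown argument: ${name}`);
    } else if (typeof value !== "string") {
      problems.push(`${name} must be a string`);
    } else if (param.enum !== undefined && !param.enum.includes(value)) {
      problems.push(`${name} must be one of: ${param.enum.join(", ")}`);
    }
  }
  return problems;
}

const validCall = validateCall(createComponentTool, { component: "card", variant: "glass" });
const invalidCall = validateCall(createComponentTool, { component: "card", variant: "glass", radius: "xl" });
// validCall is [] and invalidCall is ["radius must be one of: sm, md, lg, full"]
```

The problems come back as data, so they can be wrapped in the same structured error shape as any other failure and handed to the model to correct its call.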

Designing for both audiences at once

None of this means abandoning human-centered design. A great GUI still matters; people still need beautiful, legible interfaces. The shift is that the GUI is no longer the only interface, or even the primary one for a large and growing class of users. The same capability now needs two faces: a rendered one for people and a machine-readable one for agents — and the machine-readable face is the one most teams are missing today.

Building software this way (headless surfaces, typed contracts, JSON tokens, sandboxed execution, full tracing) is exactly the kind of systems work Game Changer Labs does when we design and ship products meant to be operated by humans and agents together. The next generation of users is already reading your API; the open question is whether your product is legible to them.

Frequently Asked Questions

How do you make an API usable by an AI agent?

Expose a machine-readable contract the agent can read without a human, such as an OpenAPI spec, JSON schemas, and an llms.txt index. Keep operations deterministic and idempotent so retries are safe, return structured typed errors the agent can branch on, and use predictable resource naming. The goal is that an agent can discover what is possible, call it, read the result, and recover from failure entirely from text.

What is llms.txt and should my project have one?

llms.txt is a proposed convention: a plain-text file, written in Markdown and served at the root of a site, that gives language models a curated, link-rich map of your most important documentation and endpoints. It is the agent-era equivalent of a sitemap or README written for machines. If you want agents to use your product correctly rather than guess from scraped HTML, an llms.txt that points to your API reference, schemas, and key guides is one of the highest-leverage files you can add.

Why are GUI-only design systems a problem for AI agents?

An agent cannot click a Figma frame or interpret a screenshot of a component library with any reliability. If your design tokens and components only exist as visual artifacts, the agent has no way to consume them, so it invents its own values and produces off-brand, inconsistent output. The fix is to publish tokens as machine-readable JSON and components as real code the agent can import, so the design system has a surface the agent can actually read.

Why do AI agents need idempotent operations?

Agents retry. They time out, lose connections, and re-run steps inside planning loops, so the same call can fire more than once. If an operation is idempotent, a duplicate call is harmless and the agent can recover safely. If it is not, a retry can double-charge, double-create, or corrupt state. Designing create and update operations around idempotency keys is what makes autonomous retries safe rather than dangerous.

How should I design tools for function calling?

Keep each tool small and single-purpose, give it a name and description that state exactly when to use it, and define structured typed arguments rather than a single free-text field. A model selects tools from their descriptions, so vague or overlapping tools cause wrong calls. Favor a handful of focused tools with tight schemas over one general-purpose endpoint, and make the return value structured so the next step in the loop can parse it.

How do you safely let an AI agent execute code?

Run it in an isolated sandbox with least-privilege access: no ambient credentials, a constrained filesystem, controlled network egress, and hard resource limits. We use Firecracker-style microVMs so each execution is disposable and cannot reach the host or other tenants. Pair isolation with full tracing of every action the agent takes, so untrusted, model-generated code can run without putting production systems at risk.

What does gcl-cli do for AI agents specifically?

gcl-cli is a headless design tool built for humans and agents alike. Running gcl-cli tokens emits the design system as machine-readable JSON, and gcl-cli component writes ready-made React components straight to disk. Because every capability is a deterministic command with structured output, an agent can pull exact tokens and scaffold on-brand UI without parsing a screenshot or guessing hex values.


Published: March 22, 2026 · Game Changer Labs