The agent demo phase is ending. The infrastructure phase is starting.

For the last year, most arguments about AI agents have centered on model quality. Can the model plan? Can it call a tool? Can it recover from a mistake? Those questions still matter, but they are no longer the whole story. A useful agent is not just a model with a bigger prompt. It is a running system with instructions, external tools, private data, browser sessions, code execution, network access, secrets, logs, and a human fallback path.

That shift showed up clearly in two recent publications. On May 25, Red Hat published a practical breakdown of MCP servers versus skills, framing them as two different ways to supply context to a model. A week earlier, Anthropic and Cloudflare announced infrastructure for Claude Managed Agents that keeps the agent loop managed while moving tool execution, sandboxes, and private service access into infrastructure the customer controls.

The common thread is simple: context is becoming an operational surface. Once an agent can act, teams need a control plane for what it is allowed to know, what it is allowed to call, where it is allowed to run, and how every action gets audited.

Skills teach, MCP connects

Red Hat's distinction is useful because it cuts through a lot of agent vocabulary. Skills are reusable instructions. They teach an agent how a team wants recurring work done: the output format, the process, the domain conventions, the checklist, the local standard. Skills are not mainly about reaching out to a live service. They are about giving the model structured know-how.

MCP servers solve a different problem. They expose tools, resources, and prompts through a standard protocol so an AI application can discover what is available, describe those capabilities to the model, and route tool calls through a typed interface. The official MCP architecture describes tools as executable functions, resources as context data, and prompts as reusable templates. That split matters. A model should not need raw API documentation, long-lived credentials, and a hope-based instruction to update a customer record. It should see a constrained tool with a clear schema and a scoped path to the service behind it.
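As a rough illustration of what "a constrained tool with a clear schema" can mean, here is a minimal sketch in plain Python. It does not use any MCP SDK. The descriptor borrows the name, description, and inputSchema fields that MCP tool listings use, but the tool itself (update_customer_email) and the validate_arguments helper are hypothetical, and the validator covers only a small subset of JSON Schema.

```python
from typing import Any

# Hypothetical tool descriptor. The name/description/inputSchema layout follows
# the shape MCP uses for tool listings; the tool itself is made up.
UPDATE_CUSTOMER_TOOL = {
    "name": "update_customer_email",
    "description": "Update the email address on one customer record.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["customer_id", "email"],
        "additionalProperties": False,
    },
}

# Only the handful of JSON Schema types this sketch understands.
_JSON_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def validate_arguments(tool: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    schema = tool["inputSchema"]
    properties = schema["properties"]
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in properties:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected argument: {name}")
            continue
        expected = _JSON_TYPES[properties[name]["type"]]
        if not isinstance(value, expected):
            problems.append(f"wrong type for argument: {name}")
    return problems
```

The point of the design is that the model only ever proposes arguments. Anything outside the declared schema, such as a stray api_key field, is rejected before the call reaches the service.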

In practice, serious agents will use both. A support agent might use a skill that teaches the company's refund policy and tone, then call an MCP tool to inspect the actual order. A coding agent might use a skill for a repo's review standard, then call an MCP server for CI status, issue metadata, or production logs. The skill shapes judgment. MCP supplies live reach.

The mistake is treating either one as a prompt accessory. They are part of the agent's authority model. The moment a skill tells an agent how to execute a workflow, or an MCP server gives it a write-capable tool, context stops being static documentation and starts behaving like infrastructure.

The sandbox is part of the product

Anthropic's Managed Agents update makes the other half of the pattern visible. The company now supports self-hosted sandboxes and MCP tunnels for Claude Managed Agents. The agent loop can remain on Anthropic's platform, while tool execution can run inside customer-controlled infrastructure or through managed providers such as Cloudflare, Daytona, Modal, and Vercel. MCP tunnels are meant to connect agents to private MCP servers without exposing those services as public endpoints.

Cloudflare's integration points in the same direction. Its writeup describes a Workers-based control plane that assigns each agent session a sandboxed environment for code execution, CLI tools, development work, and persisted state. Cloudflare also connects this to broader agent primitives such as Browser Run, which lets agents drive browser sessions, exposes Chrome DevTools Protocol access, records sessions, and lets a human take over when a workflow hits an edge case.

This is not just hosting. It is the control surface around agency. Where does the code execute? Which network can it reach? How are sessions logged? Can the operator replay what happened? Can a human interrupt? Which secrets enter the agent process, and which stay behind a broker? Those are platform questions, not prompt questions.

The important architectural move is separation. The model can remain the reasoning engine while execution happens in a boundary the organization understands. The agent does not need to live in the same place as the database, the browser, the repo, and the credential vault. In fact, it probably should not.
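One way to picture that separation is a credential broker that sits outside the sandbox. The sketch below is an assumption about how such a boundary could look, not a description of how Anthropic or Cloudflare implement it. ToolRequest, CredentialBroker, and the tool name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolRequest:
    """What the sandboxed agent process emits: a tool name and arguments, no credentials."""
    tool: str
    arguments: dict


@dataclass
class CredentialBroker:
    """Runs outside the sandbox and holds the secrets the agent never sees."""
    # repr=False keeps tokens out of logs that print the broker object.
    _secrets: dict = field(default_factory=dict, repr=False)

    def register(self, tool: str, token: str) -> None:
        self._secrets[tool] = token

    def authorize(self, request: ToolRequest) -> dict:
        """Build the outbound call, attaching the credential at the boundary."""
        if request.tool not in self._secrets:
            raise PermissionError(f"no credential registered for tool: {request.tool}")
        return {
            "tool": request.tool,
            "arguments": dict(request.arguments),
            "headers": {"Authorization": f"Bearer {self._secrets[request.tool]}"},
        }
```

In this arrangement the agent process only ever constructs ToolRequest values. The token is attached after the request has left the sandbox, so a prompt injection that dumps the agent's environment has no credential to find.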

Production agents need boring controls

The best agent infrastructure will look less like a chat product and more like a mix of CI, identity management, API gateway, and observability stack.

Tool registration needs review. Skills need source control. MCP servers need inventory, versioning, authentication, and permission scopes. Sandboxes need egress policy and cleanup rules. Browser sessions need recordings and redaction. Logs need to show which agent called which tool with which class of input, without leaking every sensitive argument into a searchable audit table.
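The logging requirement is easier to see in code. The following is a minimal sketch, assuming arguments are JSON-serializable and that the logging service holds a secret key the agent cannot read. The audit_record function and its field names are hypothetical. It records which agent called which tool and the shape of the input, plus a keyed digest in place of the raw values.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def audit_record(agent_id: str, tool: str, arguments: dict, key: bytes) -> dict:
    """Describe a tool call by shape and keyed fingerprint rather than by raw values."""
    # Canonical JSON so the same arguments always produce the same digest.
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool,
        # Argument names and value types only, never the values themselves.
        "input_shape": {name: type(value).__name__ for name, value in sorted(arguments.items())},
        # Keyed digest: lets an investigator who holds the key match a known input.
        "input_hmac": hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest(),
    }
```

A keyed digest is used instead of a bare hash because short, guessable inputs such as email addresses can be recovered from an unkeyed hash by brute force.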

That sounds dull compared with a model release, but it is the layer that lets agents do real work. A company can tolerate an agent drafting a proposal with loose controls. It cannot tolerate an agent with unclear write permissions across Jira, GitHub, Slack, Salesforce, internal databases, and a browser session that can log in as a human. As soon as the agent becomes useful, the blast radius becomes real.

This is also why the skills-versus-MCP distinction should not become a vendor slogan. The sharp question is not which abstraction wins. The sharp question is where each abstraction is governed. Who can install a skill? Who can publish an MCP server? Can the same agent use a private CRM tool and a browser in the same session? What blocks it from copying data between them? Can the operator see that it tried?
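Those governance questions can be made concrete with a deliberately coarse session policy. This is a sketch under stated assumptions: the tool names and the two label sets are hypothetical, and a real system would need finer-grained data-flow tracking than a single per-session flag.

```python
from dataclasses import dataclass, field

# Hypothetical labels: which tools read private data and which can send data out.
PRIVATE_SOURCES = {"crm.lookup"}
EXTERNAL_SINKS = {"browser.navigate", "browser.type"}


@dataclass
class SessionPolicy:
    """Per-session gate: once private data is read, external sinks are blocked."""
    touched_private: bool = False
    attempts: list = field(default_factory=list)

    def check(self, tool: str) -> bool:
        allowed = not (self.touched_private and tool in EXTERNAL_SINKS)
        # Record every attempt, including denied ones, so the operator can see it.
        self.attempts.append({"tool": tool, "allowed": allowed})
        if allowed and tool in PRIVATE_SOURCES:
            self.touched_private = True
        return allowed
```

The same agent may use the browser before it reads from the CRM, but not after. Denied calls still land in attempts, which answers the last question in the paragraph above: the operator can see that the agent tried.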

The control plane becomes the moat

Model quality will keep improving, and that is good. Better models make agents more capable. But as agents move from demos into workflows, the durable advantage shifts toward the harness: context delivery, tool policy, sandboxing, network boundaries, identity, state, and audit.

The near future of agent adoption will not be a clean line between companies that trust AI and companies that do not. It will be a line between teams that can make agent permissions legible and teams that cannot. The former will let agents touch more systems because they can contain and observe the work. The latter will keep agents trapped in toy workflows because nobody can say what happens when the model gets creative.

That is the real significance of the recent MCP, skills, sandbox, and browser-infrastructure work. The industry is slowly moving from asking whether agents can act to asking where action is allowed to happen. That is a healthier question. A powerful agent with no control plane is a liability. A powerful agent with explicit context, scoped tools, isolated execution, and useful logs starts to look like a new kind of software worker.

The model is still the brain. The control plane is what makes the hands usable.

Sources