Code execution with MCP: building more efficient AI agents

If you’ve ever watched an agent stall while it loads hundreds of tool definitions, you know the cost is more than monetary: it’s time and frustration too. The Model Context Protocol (MCP) tackles tool fragmentation by giving agents a universal way to talk to external systems. Since its launch in November 2024, adoption has exploded, but a new problem showed up as folks connected dozens or hundreds of servers: context bloat, slow responses, and soaring token bills.

Here’s the neat trick, and yes, this feels like a practical hack you’ll want to use: present MCP servers as code APIs rather than direct tool calls. Instead of stuffing every tool definition into the model’s context, the agent writes code that reads only the files it needs from a filesystem-like view of MCP servers. The payoff is huge. In one example, token usage dropped from 150,000 to 2,000 tokens, a 98.7% saving. That’s not a typo.
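To make the filesystem idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the server names, the tool files, and the one-file-per-tool layout are hypothetical, and a real setup would generate these files from actual MCP server definitions. It only shows the discovery pattern: list the servers, list one server's tools, read one definition.

```python
import tempfile
from pathlib import Path

# Hypothetical tool definitions, one file per tool, grouped by MCP server.
# In practice these would be generated from the servers' real definitions.
TOOL_FILES = {
    "google_drive/get_document.py": '"""Fetch a document. Args: document_id (str)."""\n',
    "google_drive/list_files.py": '"""List files in a folder. Args: folder_id (str)."""\n',
    "salesforce/update_record.py": '"""Update a record. Args: record_id (str), data (dict)."""\n',
}


def build_tool_tree(root: Path) -> None:
    """Write the hypothetical tool files under root/<server>/<tool>.py."""
    for rel_path, text in TOOL_FILES.items():
        path = root / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")


def list_servers(root: Path) -> list[str]:
    """Step 1: the agent sees only server names, not tool definitions."""
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def list_tools(root: Path, server: str) -> list[str]:
    """Step 2: the agent lists the tools of the one server it cares about."""
    return sorted(p.stem for p in (root / server).glob("*.py"))


def read_tool(root: Path, server: str, tool: str) -> str:
    """Step 3: the agent loads a single definition on demand."""
    return (root / server / f"{tool}.py").read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "servers"
        build_tool_tree(root)
        print(list_servers(root))
        print(list_tools(root, "google_drive"))
        print(read_tool(root, "google_drive", "get_document"))
```

The point of the design is that only the text returned by `read_tool` needs to enter the model's context; the other definitions stay on disk until something asks for them.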

Why it works, briefly: models are good at writing and navigating code, and an execution environment lets you filter, transform, and aggregate data before anything reaches the model. Large meeting transcripts, 10,000-row spreadsheets, and multi-step workflows can be handled in code. Loops, conditionals, and error handling run locally instead of taking a model round trip per step, which also reduces latency.
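As a rough illustration of filtering before anything reaches the model, the sketch below processes a 10,000-row sheet in the execution environment and hands back only a small summary. The `fetch_sheet` function is a hypothetical stand-in for an MCP tool call and generates synthetic rows; the field names are assumptions too.

```python
from typing import TypedDict


class Row(TypedDict):
    order_id: int
    status: str
    amount: float


def fetch_sheet() -> list[Row]:
    """Hypothetical stand-in for an MCP tool call; returns 10,000 synthetic rows."""
    return [
        {
            "order_id": i,
            "status": "pending" if i % 50 == 0 else "shipped",
            "amount": float(i % 97),
        }
        for i in range(10_000)
    ]


def summarize_pending(rows: list[Row], preview: int = 5) -> dict:
    """Filter and aggregate locally; only this small dict goes back to the model."""
    pending = [r for r in rows if r["status"] == "pending"]
    return {
        "total_rows": len(rows),
        "pending_count": len(pending),
        "pending_amount": sum(r["amount"] for r in pending),
        "preview": pending[:preview],
    }


if __name__ == "__main__":
    summary = summarize_pending(fetch_sheet())
    print(summary["total_rows"], summary["pending_count"])
    for row in summary["preview"]:
        print(row)
```

The model sees a handful of rows and three numbers instead of the full sheet, and the loop over all rows never costs a model round trip.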

There are real safety wins too. Sensitive fields can be tokenized at the client, so the real values flow between systems without ever entering the model’s context. And agents can persist state and reusable functions (think Skills), so the system gets smarter over time.
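Here is one way client-side tokenization could look, as a sketch under stated assumptions. The `PiiTokenizer` class, the `[EMAIL_n]` placeholder format, and the simplified email regex are all made up for illustration; a production setup would cover more field types and use a sturdier detector.

```python
import re

# Simplified email pattern, good enough for a sketch; not a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PiiTokenizer:
    """Hypothetical helper: swaps emails for placeholders and back again."""

    def __init__(self) -> None:
        self._value_to_token: dict[str, str] = {}
        self._token_to_value: dict[str, str] = {}

    def tokenize(self, text: str) -> str:
        """Replace each distinct email with a stable placeholder like [EMAIL_1]."""

        def replace(match: re.Match) -> str:
            value = match.group(0)
            if value not in self._value_to_token:
                token = f"[EMAIL_{len(self._value_to_token) + 1}]"
                self._value_to_token[value] = token
                self._token_to_value[token] = value
            return self._value_to_token[value]

        return EMAIL_RE.sub(replace, text)

    def detokenize(self, text: str) -> str:
        """Restore real values just before the data is sent to another system."""
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text


if __name__ == "__main__":
    tokenizer = PiiTokenizer()
    original = "Contact ana@example.com or bo@example.org, then ana@example.com again"
    masked = tokenizer.tokenize(original)
    print(masked)                        # what the model would see
    print(tokenizer.detokenize(masked))  # what the downstream tool would receive
```

The mapping lives only in the client, so the model can reason about `[EMAIL_1]` and pass it along while the actual address stays out of its context.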

Of course, code execution needs secure sandboxes, resource limits, and monitoring, so it’s worth weighing implementation cost against benefit. But if you want agents that scale, respond faster, and cost less, this pattern is worth trying. Dive deeper here: https://www.anthropic.com/engineering/code-execution-with-mcp

Code execution with MCP: building more efficient AI agents

If you have ever watched an agent stall while loading hundreds of tool definitions, you know the feeling: it costs time and nerves. MCP (Model Context Protocol) provides a uniform connection between agents and external systems. Many developers have adopted MCP since November 2024, but as the number of connected tools grew, so did the context load.

A simple but powerful idea helps: treat MCP servers as code APIs rather than direct tool calls. The agent navigates a filesystem-like directory of all tools, reads only the files it needs, and runs its logic in an execution environment. The result: enormous savings. In one case, token usage dropped from 150,000 to 2,000 tokens, which is 98.7% less.

The reason is simple, and you should like it: models are good at programming, and if you pre-filter results before the model sees them, you avoid token ballast. Large transcripts, huge tables, and complex workflows can be preprocessed in code. Sensitive information also stays protected, because PII can be tokenized before it reaches the model.

In addition, intermediate state and reusable functions can be saved (keyword: Skills), so agents become more efficient over time. Sure, this adds infrastructure overhead, since you need secure sandboxes and monitoring, but the gains in latency, cost, and composability are compelling.
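A small sketch of what saving a reusable function might look like: the agent writes working code to a `skills` directory and imports it again in a later session. The directory name, the `save_skill` and `load_skill` helpers, and the example skill are all hypothetical; the loading itself uses Python's standard `importlib.util`.

```python
import importlib.util
import tempfile
from pathlib import Path

# Source of a hypothetical skill the agent wrote during an earlier task.
SKILL_SOURCE = '''
def rows_to_csv(rows):
    """Render a list of dicts as CSV text (header taken from the first row)."""
    if not rows:
        return ""
    header = list(rows[0])
    lines = [",".join(header)]
    lines += [",".join(str(row[key]) for key in header) for row in rows]
    return "\\n".join(lines)
'''


def save_skill(skills_dir: Path, name: str, source: str) -> Path:
    """Persist a function's source so later sessions can reuse it."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    path = skills_dir / f"{name}.py"
    path.write_text(source, encoding="utf-8")
    return path


def load_skill(skills_dir: Path, name: str):
    """Import a previously saved skill as a module."""
    path = skills_dir / f"{name}.py"
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skills = Path(tmp) / "skills"
        save_skill(skills, "rows_to_csv", SKILL_SOURCE)
        skill = load_skill(skills, "rows_to_csv")
        print(skill.rows_to_csv([{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]))
```

Since loading a skill executes saved code, this belongs inside the same sandbox as the rest of the agent's code.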

If you want to build agents that truly scale, this approach is worth it. More here: https://www.anthropic.com/engineering/code-execution-with-mcp