The ask_apphandoff Gateway: One Tool Instead of Fifty

How AppHandoff collapses dozens of MCP tools into one natural-language gateway: schemas on demand, paraphrased output, and confirmation before any write.

MCP, Architecture

If you've wired more than a couple of MCP servers into an AI agent, you've felt it: the context window fills up before the agent does any work. AppHandoff alone exposes dozens of tools — handoff tickets, contract scans, deploy checks, health dashboards, API specs, agent sessions. Load every schema at connect time and you've spent a large chunk of context describing tools the agent may never call.

AppHandoff's answer is a single gateway tool, ask_apphandoff. You describe what you want in plain English; an LLM router picks the right underlying tool(s), runs them, explains what it did, and asks before anything mutating happens. Here's the problem it solves and how the pattern works.

The tool-sprawl problem

Every tool you register on an MCP server adds its full JSON schema to what the model sees, typically on every request. That's the cost most teams underestimate. The context window tax is the most obvious part: dozens of tools at a few hundred tokens of schema each is real budget, gone before the agent reads your actual prompt. On long sessions it compounds, because the schemas ride along on every turn.
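To make the tax concrete, here is a back-of-the-envelope sketch in Python. The figures (50 tools, 300 tokens per schema, a 20-turn session) are illustrative assumptions, not measurements from AppHandoff, and the model assumes every schema is resent on every turn.

```python
def schema_overhead(tool_count: int, tokens_per_schema: int, turns: int) -> int:
    """Tokens spent on tool schemas over a session, assuming the schemas
    are included in the context on every turn."""
    return tool_count * tokens_per_schema * turns

# Illustrative figures only (assumptions, not measured values).
all_tools = schema_overhead(tool_count=50, tokens_per_schema=300, turns=20)
gateway_only = schema_overhead(tool_count=1, tokens_per_schema=300, turns=20)

print(f"all tools registered: {all_tools} tokens")
print(f"single gateway tool:  {gateway_only} tokens")
```

Under these assumptions the schemas alone cost 300,000 tokens over the session, against 6,000 for a single tool entry.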

Schema bloat makes it worse. Rich tools have nested parameters, enums, and descriptions, so the more capable your server, the heavier each schema and the worse the ratio of "tools described" to "tools used." Then there's agent confusion: a flat list of similarly named tools (get_handoff_requests, get_ticket, get_ticket_thread, get_ticket_activity, get_ticket_audit_log…) invites wrong picks. The model burns turns choosing, second-guessing, and retrying. More surface area means more ways to misfire.

This isn't an AppHandoff-specific failure — it's structural to MCP. The protocol is generous about what you can expose, and that generosity turns into a tax the moment your server gets useful. We dig into the broader tradeoff in how AppHandoff compares.

The gateway pattern

ask_apphandoff collapses the surface to one entry. The agent sends a natural-language query and a project_id; the gateway does the rest. First it routes: an LLM router maps the intent to the actual AppHandoff tools that satisfy it — one tool or several. Then it invokes, running the routed tools server-side where the full schemas live, so the agent never had to load them. Finally it paraphrases, turning raw tool output into plain language: "I pulled the three open tickets on this project; two are in validation, one is blocked."
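The route, invoke, paraphrase sequence can be sketched roughly as follows. This is a minimal illustration, not AppHandoff's implementation: the keyword matcher stands in for the LLM router, the tool payloads are invented, and the function names route, paraphrase, and ask_gateway are hypothetical.

```python
from typing import Callable, Dict, List

# Tool names come from the article; the return payloads are invented for illustration.
TOOLS: Dict[str, Callable[[str], dict]] = {
    "get_handoff_requests": lambda project_id: {"open": 3, "in_validation": 2, "blocked": 1},
    "get_health": lambda project_id: {"status": "ok"},
}

def route(query: str) -> List[str]:
    """Stand-in for the LLM router: naive keyword matching on the query."""
    q = query.lower()
    picked = []
    if "ticket" in q:
        picked.append("get_handoff_requests")
    if "health" in q:
        picked.append("get_health")
    return picked

def paraphrase(results: Dict[str, dict]) -> str:
    """Stand-in for the paraphrase pass: flatten raw output into one sentence."""
    if not results:
        return "I could not match that request to a tool."
    parts = []
    for name, payload in results.items():
        fields = ", ".join(f"{k}={v}" for k, v in payload.items())
        parts.append(f"{name} returned {fields}")
    return "I ran " + str(len(results)) + " tool(s): " + "; ".join(parts) + "."

def ask_gateway(query: str, project_id: str) -> str:
    tool_names = route(query)                                # 1. route
    results = {n: TOOLS[n](project_id) for n in tool_names}  # 2. invoke server-side
    return paraphrase(results)                               # 3. paraphrase

print(ask_gateway("How many open tickets are there?", "proj_123"))
```

The point of the shape is that only ask_gateway is visible to the agent; the registry and its schemas stay on the server side.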

Schemas load on demand, on the server, instead of up front in the agent's context. One tool entry replaces dozens in the window. The agent reasons about intent — "close the deploy-blocker ticket" — not about which of five ticket-reading tools to reach for.

There's also list_tools_catalogue for when an agent genuinely wants to browse what's available — discovery without committing the whole catalogue to context, and it works without a project scope. You can see the full surface on the MCP server page.

Three behaviors worth knowing

The AppHandoff gateway layers three features on top of plain routing.

F1: Paraphrase. Before and after acting, the gateway restates things in human terms, so you're not parsing JSON to know what happened; you get a sentence.

F2: Confirm flow. Mutating tools (closing or updating tickets, triggering rescans, publishing contracts) don't fire blind. The gateway stashes the pending action and returns a pending_confirm shape with a confirm_token. The agent (or human) follows up with { confirm_token, confirm: true } to execute or { confirm: false } to cancel. Read-only calls skip this entirely, so there is no friction where there's no risk.

F3: Codebase-aware routing. The router pre-fetches a project summary as context, so it routes against what your project actually looks like, not a generic guess.

The safety flow, concretely

The paraphrase-plus-confirm design is the part that makes a one-tool gateway safe to hand an autonomous agent. The split is by consequence. Read-only intent ("what's the health of this project?") routes, runs, and answers in one shot — nothing to confirm. Mutating intent ("close HO-42 as fixed") routes, then stops: it returns the paraphrased plan and a confirm_token, and nothing is written until a follow-up confirms.
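A stash-then-confirm flow like the one described could look something like this in Python. It is a sketch under assumptions: the response fields status and plan, the in-memory PENDING store, and the close_ticket tool name are hypothetical, and this version requires the token for a cancel as well as for a confirm.

```python
import secrets
from typing import Dict

# Hypothetical in-memory stash: confirm_token -> pending action.
PENDING: Dict[str, dict] = {}

def propose(tool: str, args: dict) -> dict:
    """Stash a mutating action and return a pending_confirm shape."""
    token = secrets.token_urlsafe(16)
    PENDING[token] = {"tool": tool, "args": args}
    return {
        "status": "pending_confirm",
        "confirm_token": token,
        "plan": f"I will run {tool} with {args}. Confirm to proceed.",
    }

def resolve(confirm_token: str, confirm: bool) -> dict:
    """Execute or cancel a stashed action. Tokens are single-use."""
    action = PENDING.pop(confirm_token, None)
    if action is None:
        return {"status": "unknown_token"}
    if not confirm:
        return {"status": "cancelled", "tool": action["tool"]}
    # A real gateway would invoke the underlying tool here.
    return {"status": "executed", "tool": action["tool"]}

pending = propose("close_ticket", {"ticket": "HO-42", "resolution": "fixed"})
print(pending["status"])
print(resolve(pending["confirm_token"], confirm=True)["status"])
```

Popping the token on first use means a replayed confirmation cannot run the same write twice.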

For power agents that have already reasoned about consequences, allow_writes: true is an express lane that skips the confirm step, with one deliberate exception: destructive tools (the delete_* family) still require an explicit confirmation even when allow_writes: true is set, so a stray natural-language request can never quietly delete something. Two gates, not one.
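One reading of these rules, written as a decision function. This is a sketch, not the gateway's actual logic: it assumes destructive tools always go through confirmation regardless of allow_writes, and it classifies tools by name prefix (get_ and list_ as read-only), which is a simplification made for illustration.

```python
from enum import Enum

class Gate(Enum):
    RUN = "run"          # execute immediately
    CONFIRM = "confirm"  # return a pending_confirm and wait

# Assumption for illustration: read-only tools are recognizable by prefix.
READ_ONLY_PREFIXES = ("get_", "list_")

def gate_for(tool_name: str, allow_writes: bool = False) -> Gate:
    if tool_name.startswith("delete_"):
        # Destructive: allow_writes does not bypass confirmation.
        return Gate.CONFIRM
    if tool_name.startswith(READ_ONLY_PREFIXES):
        # Read-only: nothing to confirm.
        return Gate.RUN
    # Other mutating tools: confirm unless the caller opted in.
    return Gate.RUN if allow_writes else Gate.CONFIRM

for name, writes in [("get_health", False), ("close_ticket", False),
                     ("close_ticket", True), ("delete_ticket", True)]:
    print(name, writes, gate_for(name, writes).value)
```

The names close_ticket and delete_ticket are hypothetical examples of a mutating and a destructive tool.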

This is the right shape for AI-built apps: the agent gets velocity on reads, a guardrail on writes, and a hard stop on deletes — all without you hand-wiring confirmation logic per tool. It's the same instinct behind our closed-loop workflow: keep humans and bots reading from one source of truth without surprises.

When you still want direct tools

The gateway is a default, not a mandate. Reach past it when you know exactly which tool you need — if your agent always calls get_health in a tight loop, the direct tool skips a routing LLM hop for lower latency and deterministic behavior. Reach past it when you need deterministic arguments: a natural-language query is interpreted, but a direct call with explicit parameters is not, so scripted or test paths should call the underlying tool. And reach past it when you're latency-sensitive, because the gateway adds an LLM routing pass and a paraphrase pass that aren't worth it for high-frequency, well-known calls.

A good setup uses both: the gateway for the long tail of "do the thing I described," and a handful of direct tools for the hot paths you invoke constantly. The gateway shrinks the context cost of having dozens of capabilities; direct tools keep your three most-used ones fast.
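A mixed setup might be wired up along these lines. The sketch only builds the call an agent would make; the DIRECT_TOOLS set, the build_call helper, and the assumption that get_health takes a project_id argument are all hypothetical.

```python
from typing import Optional

# Hypothetical hot-path tools the agent has chosen to call directly.
DIRECT_TOOLS = {"get_health"}

def build_call(project_id: str, query: str, tool: Optional[str] = None) -> dict:
    """Use a direct tool when one is named and registered as a hot path;
    otherwise fall back to the natural-language gateway."""
    if tool in DIRECT_TOOLS:
        # Explicit arguments, no routing or paraphrase pass.
        return {"tool": tool, "arguments": {"project_id": project_id}}
    return {
        "tool": "ask_apphandoff",
        "arguments": {"query": query, "project_id": project_id},
    }

print(build_call("proj_123", "what's the health of this project?", tool="get_health"))
print(build_call("proj_123", "close the deploy-blocker ticket"))
```

Keeping the direct set small preserves most of the context savings, since only those few schemas need to be loaded alongside the gateway.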

Try it

If you're configuring MCP servers for AI agents and fighting context bloat, the gateway pattern is one of the cheapest wins available — one tool entry, schemas on demand, confirmation built in. See it live in the AppHandoff MCP server at https://api.apphandoff.com/api/mcp-bot, and browse the full tool reference on the MCP server page.