Code Mode

What it does

When you connect many MCP servers (8-10 servers, 150+ tools), every request would otherwise carry all of those tool definitions in the model’s context, burning most of the context budget on reading tool catalogs.

Code Mode keeps the context compact: instead of exposing every tool definition, the model discovers tools on demand and writes a short script that orchestrates them in a single request. On a typical multi-step workflow across ~5 servers you can expect roughly 50% lower token cost and 30-40% faster execution.
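As a rough illustration of "a short script that orchestrates them", the sketch below chains two tools inside one function. Everything in it is hypothetical: search_products and check_inventory are local stand-ins for tools on two MCP servers, and this section does not specify the scripting language or call syntax that Code Mode actually generates.

```python
from typing import Optional

# Hypothetical stand-ins for tools on two MCP servers. In a real deployment
# these calls would be routed to the connected servers.
def search_products(query: str) -> list:
    return [
        {"sku": "A1", "name": f"{query} basic", "price": 19.0},
        {"sku": "B2", "name": f"{query} pro", "price": 49.0},
    ]

def check_inventory(sku: str) -> int:
    # Units in stock per SKU (made-up data).
    return {"A1": 0, "B2": 12}.get(sku, 0)

def cheapest_in_stock(query: str) -> Optional[dict]:
    """Chain both tools in one script, so intermediate results do not
    have to round-trip through the model between tool calls."""
    in_stock = [p for p in search_products(query) if check_inventory(p["sku"]) > 0]
    return min(in_stock, key=lambda p: p["price"]) if in_stock else None

result = cheapest_in_stock("widget")
```

The point of the sketch is the shape of the work: filtering and comparison happen in the script, and only the final result needs to go back to the model.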

When to Use Code Mode

Enable Code Mode if you have:

✅ 3+ MCP servers connected
✅ Complex multi-step workflows
✅ Concerns about token costs or latency
✅ Tools that need to interact with each other

Keep Classic MCP if you have:

✅ Only 1-2 small MCP servers
✅ Simple, direct tool calls
✅ Very latency-sensitive use cases (though Code Mode is usually faster)

You can mix both: Enable Code Mode for “heavy” servers (web, documents, databases) and keep small utilities as direct tools.

Enabling Code Mode

Code Mode must be enabled per MCP client.

Best practice: Enable Code Mode for 3+ servers or any “heavy” server (web search, documents, databases).

Enable Code Mode for a Client

1. Navigate to MCP Gateway in the sidebar
2. Click on a client row to open the configuration sheet
3. In the Basic Information section, toggle Code Mode Client to enabled
4. Click Save Changes

Once enabled, this client’s tools are no longer exposed directly. Instead, the model discovers them on demand and orchestrates them by writing a short script, keeping the request context compact.

Binding Levels

Code Mode supports two binding levels that control how the model discovers tool definitions. This is a global setting that affects context efficiency, not how you call tools.

Server-Level Binding (Default)

Tool definitions are grouped per server. Best for servers with few tools (5-20 per server), or when you want simpler discovery.

Tool-Level Binding

Each tool’s definition is loaded individually. Best for servers with many tools (30+ per server) or large/complex schemas, where you want minimal context bloat.

Configuring Binding Level

Binding level is managed from the MCP Gateway settings in the Web UI, and can be viewed in the MCP configuration overview:

Server-level (default): tool definitions grouped per MCP server
Tool-level: tool definitions loaded per individual tool
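To make the difference between the two levels concrete, here is a minimal sketch of which units the model would load definitions for under each one. The catalog structure, the discovery_units helper, and the "server.tool" naming are assumptions made for illustration, not the gateway's actual data model.

```python
# Hypothetical catalog: server name -> tool names (not a real gateway structure).
catalog = {
    "youtube": ["search", "delete_video"],
    "docs": ["read", "write", "list"],
}

def discovery_units(catalog: dict, level: str) -> list:
    """Return the units whose definitions would be loaded on discovery."""
    if level == "server":
        # One unit per server: all of that server's tools arrive together.
        return sorted(catalog)
    if level == "tool":
        # One unit per tool: only the requested tool's schema is loaded.
        return sorted(
            f"{server}.{tool}"
            for server, tools in catalog.items()
            for tool in tools
        )
    raise ValueError(f"unknown binding level: {level}")

server_units = discovery_units(catalog, "server")
tool_units = discovery_units(catalog, "tool")
```

With server-level binding the model makes fewer, larger lookups. With tool-level binding it makes more lookups, but each one adds less to the context.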

Auto-Execution with Code Mode

When you run Code Mode under Agent Mode, generated code auto-executes only if every tool it calls is listed in tools_to_auto_execute for its server. If the code touches any tool you have not allow-listed, the whole request is paused and returned to your app for approval.

To control this, set tools_to_auto_execute to exactly the tools you are comfortable running without review. In the example below, code that only calls search runs automatically, while code that calls delete_video is held for approval:

Example:

{
  "name": "youtube",
  "tools_to_execute": ["*"],
  "tools_to_auto_execute": ["search"],
  "is_code_mode_client": true
}
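The all-or-nothing rule described above can be sketched as a small check. This is not the gateway's implementation: can_auto_execute is a hypothetical helper, and it assumes the allow-list contains literal tool names only (whether a "*" wildcard is honored in tools_to_auto_execute is not stated here, so the sketch does not handle it).

```python
# The client configuration from the example above, as a Python dict.
client_config = {
    "name": "youtube",
    "tools_to_execute": ["*"],
    "tools_to_auto_execute": ["search"],
    "is_code_mode_client": True,
}

def can_auto_execute(called_tools: list, config: dict) -> bool:
    """Return True only if every tool the generated code calls is allow-listed.

    Assumption: entries are matched as literal tool names.
    """
    allowed = set(config.get("tools_to_auto_execute", []))
    return all(tool in allowed for tool in called_tools)

runs_automatically = can_auto_execute(["search"], client_config)
held_for_approval = not can_auto_execute(["search", "delete_video"], client_config)
```

A single tool outside the allow-list is enough to hold the whole request, which is why the second call is flagged even though search on its own would run.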

Expected savings at scale

For a workflow spanning ~10 servers and 150 tools (e.g. “find matching products, check inventory, compare prices, get a shipping estimate, create a quote”), switching from classic MCP to Code Mode typically takes a request from 8-10 model turns down to 3-4, and cuts tool-definition tokens from thousands per turn to a few hundred. That translates to roughly half the cost and half the latency. Actual numbers depend on your servers and prompts.
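As a back-of-the-envelope check on the tool-definition overhead alone, the sketch below multiplies turns by tokens per turn. The specific values (9 turns at 3,000 tokens, 4 turns at 300 tokens) are assumptions picked from inside the ranges quoted above, not measurements.

```python
def tool_definition_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total tokens spent on tool definitions across all model turns."""
    return turns * tokens_per_turn

# Assumed illustrative values, chosen from within the ranges quoted above.
classic = tool_definition_tokens(turns=9, tokens_per_turn=3000)
code_mode = tool_definition_tokens(turns=4, tokens_per_turn=300)

# Fraction of tool-definition tokens removed by switching modes.
reduction = 1 - code_mode / classic
print(classic, code_mode, round(reduction, 2))
```

This covers only the tokens spent on tool definitions. Prompts, tool results, and model output are unchanged by this calculation, so the overall saving is smaller than the reduction computed here.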

Next Steps

Agent Mode: Combine Code Mode with auto-execution
MCP Gateway URL: Expose your tools to external clients