Code Mode

When you connect many MCP servers (8-10 servers, 150+ tools), every request would otherwise carry all of those tool definitions in the model’s context, spending most of the token budget on tool catalogs rather than on the task itself.

Code Mode keeps the context compact: instead of exposing every tool definition, the model discovers tools on demand and writes a short script that orchestrates them in a single request. On a typical multi-step workflow across ~5 servers you can expect roughly 50% lower token cost and 30-40% faster execution.

Enable Code Mode if you have:

  • ✅ 3+ MCP servers connected
  • ✅ Complex multi-step workflows
  • ✅ Concerns about token costs or latency
  • ✅ Tools that need to interact with each other

Keep Classic MCP if you have:

  • ✅ Only 1-2 small MCP servers
  • ✅ Simple, direct tool calls
  • ✅ Very latency-sensitive use cases (though Code Mode is usually faster)

You can mix both: enable Code Mode for “heavy” servers (web, documents, databases) and keep small utilities as direct tools.


Code Mode must be enabled per MCP client.

Best practice: Enable Code Mode for 3+ servers or any “heavy” server (web search, documents, databases).

  1. Navigate to MCP Gateway in the sidebar
  2. Click on a client row to open the configuration sheet
  3. In the Basic Information section, toggle Code Mode Client to enabled
  4. Click Save Changes

Once enabled, this client’s tools are no longer exposed directly. Instead, the model discovers them on demand and orchestrates them by writing a short script, keeping the request context compact.
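To make "discover on demand, then orchestrate with a short script" more concrete, here is a minimal toy model in Python. Everything in it is hypothetical: `ToyServer`, `list_tools`, `describe`, `call`, and the two sample tools are illustrative stand-ins, not the gateway's actual API or the language the model really writes. It only shows the shape of the idea: tool names are listed first, definitions are looked up when needed, and one script chains several calls.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyServer:
    """Hypothetical in-memory stand-in for one MCP server's tools."""

    def __init__(self, name: str, tools: Dict[str, Tuple[str, Callable[..., Any]]]):
        self.name = name
        self._tools = tools  # tool name -> (description, implementation)

    def list_tools(self) -> List[str]:
        # Cheap discovery step: names only, no full definitions.
        return sorted(self._tools)

    def describe(self, tool: str) -> str:
        # A definition is fetched only for the tool the script needs.
        return self._tools[tool][0]

    def call(self, tool: str, **kwargs: Any) -> Any:
        return self._tools[tool][1](**kwargs)


# Sample data is made up for illustration.
catalog = ToyServer("catalog", {
    "find_products": (
        "Find products matching a query",
        lambda query: [{"sku": "A1", "price": 20.0}, {"sku": "B2", "price": 15.0}],
    ),
    "check_inventory": (
        "Units in stock for a SKU",
        lambda sku: {"A1": 0, "B2": 7}[sku],
    ),
})


def cheapest_in_stock(server: ToyServer, query: str) -> Dict[str, Any]:
    """One script that chains two tools instead of one model turn per call."""
    products = server.call("find_products", query=query)
    in_stock = [p for p in products
                if server.call("check_inventory", sku=p["sku"]) > 0]
    return min(in_stock, key=lambda p: p["price"])


result = cheapest_in_stock(catalog, "usb cable")
print(catalog.list_tools(), result["sku"])
```

In this sketch the intermediate results (the product list, the stock counts) stay inside the script, which is the reason a multi-step workflow needs fewer round trips through the model.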


Code Mode supports two binding levels that control how the model discovers tool definitions. This is a global setting that affects context efficiency, not how you call tools.

Server-level binding: tool definitions are grouped per server. Best for servers with few tools (5-20 tools per server), or when you want simpler discovery.

Tool-level binding: each tool’s definition is loaded individually. Best for servers with many tools (30+ per server) or large/complex schemas, where you want minimal context bloat.

Binding level is managed from the MCP Gateway settings in the Web UI, and can be viewed in the MCP configuration overview:

  • Server-level (default): tool definitions grouped per MCP server
  • Tool-level: tool definitions loaded per individual tool
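The guidance above can be turned into a small rule of thumb. The sketch below is an assumption-laden helper, not part of the product: the function name `suggest_binding_level` and the choice to fall back to the default for servers in the 21-29 tool range are ours. Because the binding level is a global setting, it suggests tool-level as soon as any one server reaches the 30-tool mark.

```python
from typing import Dict


def suggest_binding_level(tool_counts: Dict[str, int],
                          large_threshold: int = 30) -> str:
    """Suggest a global binding level from per-server tool counts.

    Hypothetical heuristic: 'tool-level' if any server has at least
    `large_threshold` tools, otherwise the default 'server-level'.
    Schema size, which also matters, is not modelled here.
    """
    if any(count >= large_threshold for count in tool_counts.values()):
        return "tool-level"
    return "server-level"


# Server names and counts are made up for illustration.
print(suggest_binding_level({"youtube": 8, "docs": 12}))
print(suggest_binding_level({"youtube": 8, "database": 45}))
```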

When you run Code Mode under Agent Mode, generated code auto-executes only if every tool it calls is listed in tools_to_auto_execute for its server. If the code touches any tool you have not allow-listed, the whole request is paused and returned to your app for approval.

To control this, set tools_to_auto_execute to exactly the tools you are comfortable running without review. In the example below, code that only calls search runs automatically, while code that calls delete_video is held for approval:

{
  "name": "youtube",
  "tools_to_execute": ["*"],
  "tools_to_auto_execute": ["search"],
  "is_code_mode_client": true
}
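The all-or-nothing rule described above can be sketched as a simple allow-list check. This is a minimal illustration of the rule as stated, not the gateway's implementation: the function names are hypothetical, the list of called tools is assumed to be known already, and wildcard handling in tools_to_auto_execute is deliberately not modelled because the text does not describe it.

```python
from typing import Dict, Iterable, List

# The client config from the example, as a Python dict (note True, not true).
youtube = {
    "name": "youtube",
    "tools_to_execute": ["*"],
    "tools_to_auto_execute": ["search"],
    "is_code_mode_client": True,
}


def tools_needing_approval(client: Dict, called_tools: Iterable[str]) -> List[str]:
    """Return the called tools that are NOT on the auto-execute allow-list."""
    allowed = set(client.get("tools_to_auto_execute", []))
    return [tool for tool in called_tools if tool not in allowed]


def decide(client: Dict, called_tools: Iterable[str]) -> str:
    """A single non-allow-listed tool holds the whole request for approval."""
    return "needs_approval" if tools_needing_approval(client, called_tools) else "auto_execute"


print(decide(youtube, ["search"]))                   # auto_execute
print(decide(youtube, ["search", "delete_video"]))   # needs_approval
```

Returning the offending tool names separately makes it easy for an application to show the reviewer exactly which calls triggered the pause.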

For a workflow spanning ~10 servers and 150 tools (e.g. “find matching products, check inventory, compare prices, get a shipping estimate, create a quote”), switching from classic MCP to Code Mode typically takes a request from 8-10 model turns down to 3-4, and from thousands of tool-definition tokens per turn down to a few hundred. That translates to roughly half the cost and half the latency. Actual numbers depend on your servers and prompts.
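The arithmetic behind that comparison can be sketched as turns multiplied by tool-definition tokens per turn. The figures below are placeholders picked from inside the quoted ranges (9 turns and 5,000 tokens versus 3 turns and 300 tokens), not measurements. The sketch counts only tool-definition overhead; prompts, tool results, and model output are unaffected, which is why the overall saving is closer to half than to the overhead reduction shown here.

```python
def definition_overhead(turns: int, definition_tokens_per_turn: int) -> int:
    """Total tokens spent re-sending tool definitions across all model turns."""
    return turns * definition_tokens_per_turn


# Placeholder values chosen from the ranges in the text, not measured data.
classic = definition_overhead(turns=9, definition_tokens_per_turn=5_000)
code_mode = definition_overhead(turns=3, definition_tokens_per_turn=300)

print(f"classic MCP: {classic} tokens of tool definitions")
print(f"Code Mode:   {code_mode} tokens of tool definitions")
```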

