Skip to content

feat(sandbox): add disabled_tools to hide capability tools by name#3648

Open
martimfasantos wants to merge 3 commits into
openai:mainfrom
martimfasantos:feat/sandbox-disabled-tools
Open

feat(sandbox): add disabled_tools to hide capability tools by name#3648
martimfasantos wants to merge 3 commits into
openai:mainfrom
martimfasantos:feat/sandbox-disabled-tools

Conversation

@martimfasantos

@martimfasantos martimfasantos commented Jun 15, 2026

Copy link
Copy Markdown

Summary

Adds a disabled_tools: set[str] field to SandboxAgent that hides specific capability-contributed tools from the model by name.

Today, suppressing a single tool requires writing a per-capability configure_tools callback for each capability that contributes it, which is verbose and inconsistent across capability types. This adds one declarative, top-level switch:

agent = SandboxAgent(
    name="assistant",
    disabled_tools={"view_image", "write_stdin"},
)

The filtering is applied at a single point in prepare_sandbox_agent, right after every capability's tools are gathered into one list. Comparing by .name there covers every tool type uniformly, which matters because the runtime tool-enablement path (Agent.get_all_toolsis_enabled) only inspects FunctionTool and silently skips custom tools. As a result, a tool like apply_patch (a CustomTool) cannot be hidden via is_enabled, but is removed correctly by name-based filtering at aggregation time.

Why disable individual tools

A capability is all-or-nothing: adding it gives the model every tool it ships. But the unit you usually care about is the tool, not the capability, and they don't line up. A capability can bundle a tool you want with one you don't, and dropping the whole capability to remove one tool also throws away the parts you were relying on. configure_tools can suppress a tool, but only per-capability and at the cost of writing a callback for each one.

Common reasons to drop a specific tool while keeping its capability:

  • The tool doesn't fit the deployment. Filesystem ships view_image, but a text-only assistant has no images in its workspace, so the tool is just extra schema and an action the model can waste a turn on.
  • The tool is interactive and the surface isn't. Shell ships write_stdin, which only makes sense when something is at an interactive prompt. In a batch/one-shot setup there's nothing to talk to.
  • Narrower blast radius. Keep Shell for read-only commands but drop the structured patch tool so the model can't edit files, even though both arrive together.
  • Leaner prompt. Every exposed tool is schema the model reads and a branch it can take. Trimming the ones that don't apply keeps the agent focused and cuts per-turn token cost.

Practical example: a documentation assistant wants Shell (to grep and read files) and apply_patch (to propose clean edits), but not view_image (the docs are text) and not write_stdin (nothing interactive is ever running). Today the only clean lever is the capability, so you keep Filesystem and Shell whole and live with view_image/write_stdin on the menu. With disabled_tools={"view_image", "write_stdin"} they're gone, while both capabilities keep doing everything else.

Behavior

  • Works for any capability (shell, filesystem, skills, memory, custom) since filtering is centralized, not per-capability.
  • Composes with configure_tools: the callback runs first to customize a tool, then disabled_tools removes it by name.
  • Names that match no contributed tool are ignored (no error).
  • Tools attached directly to Agent.tools are left untouched; only capability-contributed tools are filtered.
  • Included in the sandbox agent identity signature (run_state._agent_identity_signature), so duplicate-named agents that differ only by disabled_tools stay distinguishable across resume flows.
  • The field is appended to the end of the SandboxAgent dataclass to preserve public positional compatibility.

Known limitation: this removes a tool's schema but does not rewrite instruction text. The default sandbox prompt and capability instructions may still mention a disabled tool by name. The field docstring notes this; prefer disabling tools the base instructions don't reference, or override base_instructions. Making Capability.instructions tool-set-aware is a larger, signature-level change left for a follow-up.

Test plan

Added tests to tests/sandbox/test_runtime_agent_preparation.py covering:

  • field default (empty set) and per-instance isolation
  • removing a named FunctionTool
  • removing a named CustomTool (apply_patch), the case is_enabled cannot handle
  • uniform filtering across multiple capabilities
  • unknown names ignored without raising
  • untargeted tools preserved
  • empty disabled_tools passes everything through
  • Agent.tools not filtered
  • preparation returns a built agent without raising
  • composition with a real Filesystem configure_tools callback (customization survives on a non-disabled tool while a disabled one is removed)

Added a test to tests/test_run_state.py covering identity-signature stability:

  • two same-named SandboxAgents differing only by disabled_tools keep stable identities under reordered handoff graphs

Verification (all green):

  • uv run pytest tests/sandbox tests/test_run_state.py → 1030 passed, 1 skipped
  • uv run ruff format --check and uv run ruff check on changed files → pass
  • uv run mypy on changed source and test files → no issues
  • uv run mkdocs build → succeeds (docs/sandbox/guide.md updated)

Issue number

N/A

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

Add a `disabled_tools: set[str]` field to `SandboxAgent` that removes any
capability-contributed tool whose name matches an entry, applied at the single
point in `prepare_sandbox_agent` where capability tools are merged.

Filtering by name at the aggregation point covers every tool type uniformly,
including custom tools such as `apply_patch` that the runtime tool-enablement
check (`is_enabled`, which only inspects `FunctionTool`) cannot hide. It also
composes with per-capability `configure_tools` callbacks: those run first to
customize a tool, then `disabled_tools` removes it by name. Unknown names are
ignored, and tools attached directly to `Agent.tools` are left untouched.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 62d4714da7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +87 to +88
capability_tools = [
tool for tool in capability_tools if tool.name not in agent.disabled_tools

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove disabled tools from generated instructions too

When disabled_tools includes tools mentioned by sandbox instructions, this filters only the tool object while build_sandbox_instructions(...) still renders the unfiltered default/capability text. In the default configuration src/agents/sandbox/instructions/prompt.md:59 still tells the model to use apply_patch, and Shell.instructions() still says to use exec_command, so SandboxAgent(disabled_tools={"apply_patch"}) or {"exec_command"} hides the schema but leaves a direct instruction to call an unavailable tool. Filter or condition those instruction fragments on the enabled tool set too.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real gap, thanks. I scoped the instruction-text rewrite out of this PR deliberately: Capability.instructions(self, manifest) has no tool-set parameter today, so conditioning fragments on the enabled set would mean changing that signature across every capability and the abstract base, plus string-editing the hardcoded prompt.md default. That's a broader, signature-level change than adding one field, and it deserves its own PR rather than riding along here.

For now I documented the limitation on the field (c6fd4da): disabled_tools removes the tool's schema, and the docstring tells callers to prefer disabling tools the base instructions don't reference, or to override base_instructions. Happy to open a follow-up that makes instructions() tool-set-aware if you'd prefer it bundled.

- Drop the __post_init__ set-coercion: the field is only used in a `not in`
  membership test, which works for any container, so the coercion guarded
  nothing and risked silently splitting a bare string into a char set.
- Shorten the field docstring to one line, matching every other field in the
  dataclass; move the implementation detail to the inline comment.
- Document disabled_tools in docs/sandbox/guide.md (options table + a short
  subsection clarifying how it differs from capabilities and configure_tools).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b2a1b7c95

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

run_as: User | str | None = None
"""User identity used for model-facing sandbox tools such as shell, file reads, and patches."""

disabled_tools: set[str] = field(default_factory=set)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include disabled tools in sandbox identity

When two reachable SandboxAgents share a name and differ only by disabled_tools, the run-state duplicate-name identity logic cannot distinguish them: _agent_identity_signature includes sandbox fields like default_manifest, base_instructions, and capabilities, but not this new field. In resume flows that rebuild the graph with a different handoff order, identities then fall back to traversal order, so a saved sandbox session can be restored onto the sibling agent with a different disabled tool set (for example disabling apply_patch vs view_image). Include disabled_tools in the sandbox agent identity signature as part of adding this behavior-changing field.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed in c6fd4da. disabled_tools is behavior-changing, so it's now included in _agent_identity_signature (sorted, only when non-empty), mirroring how default_manifest/base_instructions/capabilities are appended. Added a regression test (test_build_agent_identity_map_uses_disabled_tools_for_duplicate_names) that builds two same-named SandboxAgents differing only by disabled_tools under reordered handoff graphs and asserts the identities stay stable.

Address review feedback:
- Add disabled_tools to `_agent_identity_signature` so two same-named
  SandboxAgents that differ only by disabled_tools stay distinguishable in
  run-state resume flows that rebuild the agent graph. Mirrors the existing
  conditional handling of default_manifest / base_instructions / capabilities.
- Document that disabled_tools removes a tool's schema but does not rewrite
  instruction text that names the tool; prefer disabling tools the base
  instructions do not reference, or override base_instructions.
@594807047-cpu

This comment was marked as off-topic.

@seratch

seratch commented Jun 15, 2026

Copy link
Copy Markdown
Member

Thanks for sharing this idea, but we don't plan to add this option. For most use cases, the default set of the tools should work well. If you need to restrict some of them, please have your own customized capabilities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants