Wires GET /v1/denials?last=N on the sandbox-local policy advisor API to read recent OCSF JSONL events from /var/log/openshell-ocsf.YYYY-MM-DD.log, filter to network/L7 denials (action_id=2, class_uid 4001/4002), and return a compact summary newest-first. Default limit is 10, capped at 100. Runs inside spawn_blocking so file I/O does not block the policy.local handler.

Other cleanup:

- POST /v1/proposals now uses the typed grpc_client wrapper instead of raw_client, so accepted/rejected counts surface to the agent uniformly. Wrapper return type extended to the response struct.
- Drop the 'add_rule' snake_case alias in the proposal JSON; canonical form is camelCase 'addRule', matching the PolicyMergeOperation convention used elsewhere.
- skills/policy_advisor.md updated to match: documents the now-real /v1/denials?last=10 endpoint and uses 'addRule' consistently.
- skills.rs test asserts on the canonical 'addRule' phrase rather than the removed 'PolicyMergeOperation' substring.
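The `/v1/denials` selection logic described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `OcsfEvent`, `clamp_limit`, `is_denial`, and `recent_denials` are hypothetical names, and the sketch assumes events have already been parsed from the JSONL log in oldest-first order.

```rust
/// Hypothetical, minimal stand-in for one parsed OCSF event.
/// Real events carry many more fields.
#[derive(Debug, Clone, PartialEq)]
struct OcsfEvent {
    class_uid: u32,
    action_id: u32,
    host: String,
}

const DEFAULT_LIMIT: usize = 10;
const MAX_LIMIT: usize = 100;

/// Resolve `?last=N`: default to 10 when absent, cap at 100.
fn clamp_limit(last: Option<usize>) -> usize {
    last.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT)
}

/// A denial is an event with action_id 2 in class 4001 or 4002.
fn is_denial(ev: &OcsfEvent) -> bool {
    ev.action_id == 2 && matches!(ev.class_uid, 4001 | 4002)
}

/// `events` is in log order (oldest first); the result is newest-first.
fn recent_denials(events: &[OcsfEvent], last: Option<usize>) -> Vec<OcsfEvent> {
    events
        .iter()
        .rev()
        .filter(|ev| is_denial(ev))
        .take(clamp_limit(last))
        .cloned()
        .collect()
}
```

Reversing before filtering means the limit applies to the newest denials rather than to the first N lines of the log.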
format_endpoint() previously rendered only host:port, dropping protocol, access, and the L7 rules array. That made openshell rule get text output unable to distinguish a broad L4 grant from a method/path-scoped L7 REST rule -- exactly the distinction a developer needs at approval time.

New rendering tags each endpoint with its enforcement layer and surfaces allow/deny rules:

  bare L4:         api.example:443 [L4]
  L7 read-only:    api.example:443 [L7 rest, access=read-only]
  L7 method/path:  api.example:443 [L7 rest, allow PUT /v1/foo/bar]

Pure display change: no proto, gateway, or behavior changes. Unit test covers all three rendering cases with synthetic fixtures.
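A renderer producing the three shapes listed in this commit message might look like the sketch below. The `Endpoint` and `L7Rule` structs are hypothetical simplifications (the real types come from the policy proto), and the sketch assumes that a missing `protocol` is what marks a bare L4 grant.

```rust
/// Hypothetical, simplified rule shape for illustration only.
struct L7Rule {
    allow: bool,
    method: String,
    path: String,
}

/// Hypothetical, simplified endpoint shape for illustration only.
struct Endpoint {
    host: String,
    port: u16,
    protocol: Option<String>, // assumption: None means a bare L4 grant
    access: Option<String>,
    rules: Vec<L7Rule>,
}

fn format_endpoint(ep: &Endpoint) -> String {
    let addr = format!("{}:{}", ep.host, ep.port);
    let Some(protocol) = &ep.protocol else {
        return format!("{} [L4]", addr);
    };
    // Collect the L7 details in a stable order: protocol, access, rules.
    let mut parts = vec![format!("L7 {}", protocol)];
    if let Some(access) = &ep.access {
        parts.push(format!("access={}", access));
    }
    for rule in &ep.rules {
        let verb = if rule.allow { "allow" } else { "deny" };
        parts.push(format!("{} {} {}", verb, rule.method, rule.path));
    }
    format!("{} [{}]", addr, parts.join(", "))
}
```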
Re-shape examples/agent-driven-policy-management/ to be a single, clean end-to-end demonstration of the agent-driven policy loop. A Codex agent inside an OpenShell sandbox attempts a GitHub Contents API write, hits a structured 403 from the L7 proxy, reads the policy_advisor skill, drafts a narrow addRule proposal via http://policy.local/v1/proposals, the host auto-approves, the sandbox hot-reloads policy, and the agent's retry succeeds. Whole loop runs in roughly two minutes.

Demo cleanup:

- Drop .env file ceremony. Defaults resolve from gh: owner via 'gh api user --jq .login', repo defaults to 'openshell-policy-demo', token from gh auth token / GITHUB_TOKEN / GH_TOKEN. With gh auth login and codex login already done, 'bash demo.sh' Just Works.
- Codex-specific. Bootstraps ~/.codex/auth.json from credentials injected by the OpenShell provider, runs codex exec --sandbox danger-full-access (OpenShell is the actual security boundary; bwrap nesting cannot create user namespaces inside the sandbox container).
- Tighter narrative output: a single 'Preflight' step, a run summary banner before launch, an inline narration of what's happening inside the sandbox while we poll for the proposal (including the literal structured 403 body the agent acts on), and an OCSF trace at the end filtered to the three events that tell the story (DENY, RELOAD, ALLOW).
- Replace Python heredoc templating with sed; uploads use the single-flag pattern (--upload "${PAYLOAD_DIR}:/sandbox") with files referenced at the basename-prefixed path that #952 / #1028 established.
- README documents the trust model honestly: structured rule is the contract, agent rationale is a hint, prover validation badge in progress per RFC 0001.

Move the deterministic no-LLM regression harness out of examples/ into e2e/policy-advisor/ -- it was a parallel demo, not an example. Same loop without the LLM, useful for iterating on the proxy and policy.local API.
Whitespace-only fixups caught by mise run pre-commit. No functional change.
The demo task is mechanical (one HTTP request, parse a structured 403, post a JSON proposal, retry). Codex's default high-effort reasoning roughly doubles the demo's wall time without improving outcomes; running at 'low' lands the same minimal L7 grant in roughly half the time. Override with DEMO_CODEX_REASONING=medium (or higher) to compare runs.
johntmyers
reviewed
May 4, 2026
Three changes addressing review feedback before merging the agent-driven policy management MVP:

- Distinguish "OCSF JSONL enabled, no denials" from "OCSF JSONL disabled, nothing to read." The endpoint now returns a `log_available` flag and an explanatory `note` when the log file is missing, so the in-sandbox agent can give the developer an accurate hint instead of a misleading empty list.
- Stop echoing the OCSF `message` field in the per-denial summary. The proxy's denial messages can include the request path with query string (e.g., `?access_token=...`); the structured `host`/`port`/`method`/`path`/`binary` fields carry everything the agent needs to draft a proposal, and `path` is sourced from `http_request.url.path`, which already excludes the query string.
- Cap `read_request_body` at a 15s timeout. Bounds slowloris-style stalls from a misbehaving in-sandbox process. The proxy listener only accepts loopback connections so practical impact is small, but this is cheap defense-in-depth.

New tests cover the missing-log signal and the message-redaction guarantee.
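The missing-log signal from the first bullet can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `DenialsResponse` struct, the `load_denials` function, the `lines` field (the real endpoint returns structured per-denial summaries), and the wording of the note. Only the `log_available` and `note` field names come from the commit message.

```rust
use std::{fs, io, path::Path};

/// Hypothetical response shape; only `log_available` and `note` are named in the source.
struct DenialsResponse {
    log_available: bool,
    note: Option<String>,
    lines: Vec<String>, // raw log lines, standing in for parsed denial summaries
}

/// Treat a missing log file as "nothing to read" rather than "no denials".
fn load_denials(log_path: &Path) -> io::Result<DenialsResponse> {
    match fs::read_to_string(log_path) {
        Ok(text) => Ok(DenialsResponse {
            log_available: true,
            note: None,
            lines: text.lines().map(str::to_owned).collect(),
        }),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(DenialsResponse {
            log_available: false,
            // Assumed wording; the real note text is not given in the source.
            note: Some("OCSF JSONL log not found; logging may be disabled".to_string()),
            lines: Vec::new(),
        }),
        // Any other I/O failure is a real error, not a missing-log signal.
        Err(e) => Err(e),
    }
}
```

With this shape, an empty `lines` list with `log_available: true` means the log exists and holds no denials, which is the distinction the review asked for.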
…_DIR

Two small hardening passes on the policy management demo:

- `fail()` now pipes the agent log tail through a redactor that masks the GitHub token and Codex credential triple before printing. Codex itself is well-behaved about not echoing the token, but a misbehaving tool call could leak it; this is a final safety net before the log hits the developer's terminal (and any clipboard or chat history that follows).
- `validate_env` now regex-checks DEMO_FILE_DIR with the same allow-list the other path-shaped variables use. The value is interpolated through sed with `|` as the delimiter when rendering the agent task; rejecting unsupported characters keeps the templating predictable and stops a user-supplied value from breaking out into a shell context.
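The two hardening ideas above are implemented in the demo's shell script; the sketch below only illustrates the same logic in Rust. The function names `redact` and `is_safe_path_value` are hypothetical, and the exact allow-list (ASCII letters, digits, `.`, `_`, `-`, `/`) is an assumption, since the commit message does not spell out the permitted characters.

```rust
/// Replace every occurrence of each non-empty secret with a fixed mask.
fn redact(log: &str, secrets: &[&str]) -> String {
    let mut out = log.to_string();
    for secret in secrets.iter().filter(|s| !s.is_empty()) {
        out = out.replace(*secret, "[REDACTED]");
    }
    out
}

/// Assumed allow-list for path-shaped values: ASCII letters, digits, and `. _ - /`.
/// Anything else, including the `|` used as the sed delimiter, is rejected.
fn is_safe_path_value(value: &str) -> bool {
    !value.is_empty()
        && value
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-' | '/'))
}
```

Skipping empty secrets matters: replacing an empty pattern would insert the mask between every character of the log.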
Addresses review feedback that the deny body's `next_steps` array and the route table could drift apart. The route paths and skill location now live as `pub const`s in `policy_local.rs` and feed both:

- the dispatcher in `route_request` that matches against them
- a new `agent_next_steps()` helper that builds the JSON the L7 deny body embeds

`l7/rest.rs::deny_response_body` calls `policy_local::agent_next_steps()` instead of inlining the array, so adding or renaming a route is a one-line change in `policy_local.rs` and the agent contract follows automatically.
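The single-source-of-truth pattern described here can be sketched as below. Only `route_request`, `agent_next_steps`, and the two route paths appear in the source; the constant names, the `Route` enum, the `SKILL_PATH` value, and the wording of each step are assumptions for illustration, and the real helper builds JSON rather than a list of strings.

```rust
pub const ROUTE_DENIALS: &str = "/v1/denials";
pub const ROUTE_PROPOSALS: &str = "/v1/proposals";
/// Assumed location; the source names the skill file but not its in-sandbox path.
pub const SKILL_PATH: &str = "skills/policy_advisor.md";

#[derive(Debug, PartialEq)]
enum Route {
    Denials,
    Proposals,
    NotFound,
}

/// Dispatcher: matches against the shared constants.
fn route_request(method: &str, target: &str) -> Route {
    // Drop any query string (e.g. `?last=10`) before matching the path.
    let path = target.split('?').next().unwrap_or(target);
    match (method, path) {
        ("GET", ROUTE_DENIALS) => Route::Denials,
        ("POST", ROUTE_PROPOSALS) => Route::Proposals,
        _ => Route::NotFound,
    }
}

/// Builds the hints embedded in the deny body from the same constants,
/// so renaming a route updates both the dispatcher and the agent contract.
fn agent_next_steps() -> Vec<String> {
    vec![
        format!("read {}", SKILL_PATH),
        format!("GET http://policy.local{}?last=10", ROUTE_DENIALS),
        format!("POST http://policy.local{}", ROUTE_PROPOSALS),
    ]
}
```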
Summary
MVP of agent-driven policy management: the loop where an in-sandbox agent
hits a policy block, drafts a narrow rule, submits it via `policy.local`,
and the developer approves out-of-band. Implements the vertical slice
described in RFC 0001 and tracked under #1062, with the build plan in
`architecture/plans/agent-driven-policy-management-v1.md`.

The full loop, demoed end-to-end with Codex inside an OpenShell sandbox,
writes a markdown file to GitHub via a proposal-and-approval round-trip
that takes about two minutes total.
Related Issue
Refs #1062.
Demo output
Real run from a local dev gateway against a scratch GitHub repo. The whole
loop is `bash examples/agent-driven-policy-management/demo.sh` with no
arguments; defaults resolve from `gh` and `~/.codex/auth.json`.

Two things worth calling out from this output:

- The `Endpoints:` line shows the actual L7 grant: `[L7 rest, allow PUT /repos/.../...md]`. Codex submits a method/path-scoped REST rule, not a broad L4 allow. Before this PR, `openshell rule get` rendered only `host:port` and dropped `protocol`, `access`, and the `rules` array, so a developer at approval time couldn't tell L4 from L7. The CLI rendering fix surfaces what the agent already submits.
- The structured deny body carries `layer`, `method`, `path`, `rule_missing`, and `next_steps`, which is enough for the agent to recover without prompt scaffolding telling it which file to read. Reading the skill is one of the `next_steps` the response itself names.

Changes
feat(sandbox): wire policy.local denials to OCSF JSONL log

- `GET /v1/denials?last=N` on the sandbox-local API now reads the OCSF JSONL log at `/var/log/openshell-ocsf.YYYY-MM-DD.log`, filters to network and L7 denials (`action_id=2`, `class_uid` 4001/4002), and returns a compact summary newest-first. Default limit 10, capped at 100. Runs in `spawn_blocking` so file I/O does not stall the `policy.local` handler.
- `POST /v1/proposals` now uses the typed `grpc_client` wrapper instead of `raw_client`. Wrapper return type extended to the response struct so accepted/rejected counts surface uniformly.
- Drop the `add_rule` snake_case alias in proposal JSON; canonical form is `addRule`, matching the `PolicyMergeOperation` convention used elsewhere in the codebase.
- `skills/policy_advisor.md` updated to document the now-real `/v1/denials?last=10` endpoint and use `addRule` consistently.

feat(cli): show L7 protocol/method/path in rule get output

`format_endpoint()` previously rendered only `host:port`, dropping
`protocol`, `access`, and the L7 `rules` array. That made
`openshell rule get` text output unable to distinguish a broad L4 grant
from a method/path-scoped L7 REST rule, which is exactly the distinction a
developer needs at approval time.
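New rendering tags each endpoint with its enforcement layer and surfaces
allow/deny rules:

- bare L4: `api.example:443 [L4]`
- L7 read-only: `api.example:443 [L7 rest, access=read-only]`
- L7 method/path: `api.example:443 [L7 rest, allow PUT /v1/foo/bar]`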
Pure display change: no proto, gateway, or behavior changes. Unit test
covers all three cases with synthetic fixtures.
refactor(examples): rewrite policy demo as Codex-default loop

Re-shaped `examples/agent-driven-policy-management/` as a single, clean
end-to-end demonstration with smart defaults: `bash demo.sh` works after
`gh auth login` and `codex login`, with no `.env` ceremony or required
arguments. Defaults resolve from `gh` (owner via `gh api user`, repo
defaults to `openshell-policy-demo`, token from `gh auth token`).

Demo output narrates the loop for a developer reading along: the structured
deny body the agent receives, the agent's drafted proposal (now showing
the L7 method/path), the policy hot-reload, and the OCSF trace at the end
filtered to the three story-relevant events.

Moved the deterministic no-LLM regression harness out of `examples/` into
`e2e/policy-advisor/`; it was a parallel demo, not an example. Same loop
without the LLM, useful for iterating on the proxy and `policy.local` API.

The README documents the trust model honestly: structured rule is the
contract, agent rationale is a hint, prover validation badge in progress
per RFC 0001 Phase 3.
Testing
- `cargo test -p openshell-sandbox --lib` (604 tests, all pass; 6 are new in `policy_local::tests`)
- `cargo test -p openshell-cli format_endpoint` (renderer unit test covers L4, L7 read-only, L7 method/path)
- `cargo clippy -p openshell-sandbox --lib --tests -- -D warnings` (clean)
- `shellcheck` clean on `demo.sh`, `sandbox-agent.sh`, `e2e/policy-advisor/test.sh`, `e2e/policy-advisor/sandbox-runner.sh`
- `bash examples/agent-driven-policy-management/demo.sh` against a local gateway with Codex auth and a scratch GitHub repo. Confirmed Codex submits a method/path-scoped L7 REST rule (visible after the CLI rendering fix) and that hot-reload + retry works.
Checklist
feat(sandbox), feat(cli), refactor(examples))
--signoff if maintainers want; matched the existing branch style for now
describe this scope; no new architecture doc needed
The renderer surfaces what the agent submitted; if a future agent
defaults to L4 against a known-REST host, that signal belongs in the
gateway-side prover (Phase 3), not in the prompt.
Out of scope (deferred per the build plan)
policy.local is the agent surface this PR ships)
Side note
While testing this branch I noticed `examples/multi-agent-notepad/demo.sh`
regressed against `main` after #952 / #1028 changed `--upload <dir>:<dir>`
semantics. Filed as #1147 with a five-line suggested fix. Not in scope here.