[codex] Build coding-deepgent local parity baselines #220

Open
kun1s2 wants to merge 239 commits into shareAI-lab:main from kun1s2:codex/stage-12-14-context-compact-foundation

Conversation

kun1s2 commented Apr 14, 2026

Summary

Build the current coding-deepgent local parity baselines:

  • keep Approach A MVP as a verified historical baseline
  • complete Circle 1 local daily-driver parity baseline
  • complete a local Circle 2 expanded parity baseline
  • keep hosted SaaS ingress, multi-user auth, public marketplace backend, and cross-machine workers explicitly out of scope

This PR is no longer just an MVP closeout branch. It now represents the current local product baseline for coding-deepgent.

Scope

Historical MVP closeout included on this branch

The branch still contains the earlier Stage 12-29 work:

  • context / compact / session / recovery hardening
  • durable task / plan / verifier boundaries
  • bounded subagent / fork runtime
  • local extension foundation
  • observability / evidence closeout
  • canonical MVP dashboard and deferred-boundary ADR work

Circle 1 local daily-driver parity baseline

Implemented on this branch:

  • runtime-core parity pack
  • session inspect / history / projection / timeline / evidence / permissions CLI surfaces
  • durable task / plan control CLI surfaces
  • active TUI background-subagent controls and runtime snapshots
  • local skills / MCP / hooks / plugins inspect / validate / debug surfaces
  • deterministic coding-deepgent acceptance circle1

Circle 2 local expanded parity baseline

Implemented on this branch:

  • durable local event_stream
  • durable local worker_runtime
  • local mailbox
  • local teams orchestration records
  • local remote session/control records and replay
  • local extension_lifecycle
  • local continuity artifacts
  • deterministic coding-deepgent acceptance circle2

What This PR Delivers Now

The branch establishes a coherent local product baseline with:

  • one persistent local runtime store
  • one session/evidence/recovery model
  • one CLI/TUI product surface
  • one local extended-control substrate

Explicit Non-Goals / Deferred Beyond This PR

These are still not claimed by the current local baseline:

  • hosted remote/session ingress service
  • multi-user auth / org policy / billing
  • public marketplace backend
  • cross-machine workers or true distributed daemon supervision
  • true IDE plugin implementation beyond local remote-control records and surfaces

Reviewer Guide

Recommended review order:

  1. .trellis/project-handoff.md
  2. .trellis/plans/coding-deepgent-full-cc-parity-roadmap.md
  3. .trellis/plans/coding-deepgent-circle-2-expanded-parity-plan.md
  4. coding-deepgent/src/coding_deepgent/ runtime / session / task / subagent / event / worker / mailbox / team / remote / lifecycle / continuity domains
  5. coding-deepgent/src/coding_deepgent/cli.py and cli_service.py
  6. coding-deepgent/frontend/cli and frontend bridge protocol updates
  7. coding-deepgent/tests

Validation

Current branch validation:

  • pytest -q coding-deepgent/tests -> 438 passed
  • npm --prefix coding-deepgent/frontend/cli test -> passed
  • npm --prefix coding-deepgent/frontend/cli run typecheck -> passed
  • ruff check coding-deepgent/src/coding_deepgent coding-deepgent/tests .trellis/spec .trellis/plans -> passed
  • python3 -m mypy coding-deepgent/src/coding_deepgent -> passed
  • PYTHONPATH=coding-deepgent/src python3 -m coding_deepgent acceptance circle1 -> passed
  • PYTHONPATH=coding-deepgent/src python3 -m coding_deepgent acceptance circle2 -> passed
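
The checks above can be chained in one small runner. The following is a minimal sketch and not part of the PR: the commands are copied from the list above, while `VALIDATION_COMMANDS`, `run_validation`, and the injectable `runner` parameter are hypothetical names introduced for illustration.

```python
import os
import subprocess
from typing import Callable, List, Mapping, Sequence, Tuple

# Environment override that mirrors the PYTHONPATH prefix used for the
# acceptance runs in the Validation list.
SRC_ENV = {"PYTHONPATH": "coding-deepgent/src"}

# Each entry is (argv, extra environment variables).
VALIDATION_COMMANDS: List[Tuple[Sequence[str], Mapping[str, str]]] = [
    (["pytest", "-q", "coding-deepgent/tests"], {}),
    (["npm", "--prefix", "coding-deepgent/frontend/cli", "test"], {}),
    (["npm", "--prefix", "coding-deepgent/frontend/cli", "run", "typecheck"], {}),
    (["ruff", "check", "coding-deepgent/src/coding_deepgent",
      "coding-deepgent/tests", ".trellis/spec", ".trellis/plans"], {}),
    (["python3", "-m", "mypy", "coding-deepgent/src/coding_deepgent"], {}),
    (["python3", "-m", "coding_deepgent", "acceptance", "circle1"], SRC_ENV),
    (["python3", "-m", "coding_deepgent", "acceptance", "circle2"], SRC_ENV),
]


def _run(cmd: Sequence[str], extra_env: Mapping[str, str]) -> int:
    """Run one command with the extra variables layered over os.environ."""
    env = {**os.environ, **extra_env}
    return subprocess.run(list(cmd), env=env).returncode


def run_validation(
    commands: Sequence[Tuple[Sequence[str], Mapping[str, str]]] = VALIDATION_COMMANDS,
    runner: Callable[[Sequence[str], Mapping[str, str]], int] = _run,
) -> List[str]:
    """Run every command (no early exit) and return the failed ones as strings."""
    failed = []
    for cmd, extra_env in commands:
        if runner(cmd, extra_env) != 0:
            failed.append(" ".join(cmd))
    return failed
```

Calling `run_validation()` from the repository root would return an empty list when every check passes. The runner is injectable so the sequencing can be exercised without installing pytest, npm, ruff, or mypy.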

Residual Risks

  • The PR is broad: it spans the full transition from MVP closeout to the local Circle 1/2 baselines, so the review load remains high even after the PR body cleanup.
  • Circle 2 currently provides a local expanded baseline, not hosted remote parity.
  • Future work on true SaaS ingress, distributed workers, or a marketplace backend should be scoped as a new, explicit post-baseline phase rather than silently extending the current local model.

CrazyBoyM and others added 30 commits April 8, 2026 05:45
The OMX team runtime writes local state under .omx/, and worker worktrees require the leader workspace to be clean before launch. Committing the ignore rule preserves local orchestration artifacts outside source control while unblocking durable team execution.

Constraint: omx team refuses to launch with a dirty leader workspace because it provisions worker worktrees
Rejected: Stash .gitignore before launch | would make .omx/ unignored again during team execution
Confidence: high
Scope-risk: narrow
Directive: Keep .omx/ ignored; do not remove unless replacing the OMX state location
Tested: git diff showed only .omx/ ignore addition
Not-tested: team launch after commit
The first LangChain milestone needs CI evidence that the parallel s01-s06 track exists, compiles without OpenAI credentials, avoids import-time model starts, and preserves visible teaching harness primitives. This adds the guardrail tests and wires CI through requirements.txt so later LangChain dependency additions are installed consistently.

Constraint: Test lane owns tests/CI while code lane still owns agents_langchain implementation

Confidence: medium

Scope-risk: narrow

Tested: python -m py_compile tests/test_langchain_agents_smoke.py; python -m pytest tests/test_agents_smoke.py -q

Not-tested: tests/test_langchain_agents_smoke.py passes only after agents_langchain s01-s06 code lane lands
The docs lane needs a stable comparison entry point before the code and test lanes are integrated, so this records where the s01-s06 LangChain/OpenAI-interface track lives, how it should be configured, and how reviewers should keep it separate from the original agents/ baseline and web UI.

Constraint: First milestone is s01-s06 only and must preserve agents/ plus web/ boundaries

Constraint: LangChain docs currently install core langchain plus langchain-openai for OpenAI integration

Rejected: Surface the track through web/ now | user explicitly scoped web UI/app out of this milestone

Confidence: high

Scope-risk: narrow

Tested: python -m pytest tests/test_agents_smoke.py -q; python -m compileall agents tests -q; git diff --check; python -m pip install --dry-run -r requirements.txt pytest

Not-tested: full pytest suite, due to a pre-existing tests/test_s_full_background.py failure unrelated to the docs/deps changes
Add a parallel agents_langchain s01-s06 track so learners can compare the existing hand-written Anthropic SDK baseline against LangChain's OpenAI-interface runtime without changing the web UI or original agents.

Constraint: First milestone is s01-s06 only and must preserve agents/*.py plus web/

Rejected: Put LangChain files under agents/ | risks confusing the existing web extractor and baseline teaching boundary

Confidence: high

Scope-risk: moderate

Tested: python -m py_compile agents_langchain/*.py; python -m pytest tests/test_agents_smoke.py tests/test_langchain_agents_smoke.py -q; env -u OPENAI_API_KEY import check for agents_langchain modules
The first LangChain milestone needs to sit beside the hand-written Anthropic SDK lessons, not replace them, so this adds a separate agents_langchain package, non-live smoke tests, OpenAI-style setup docs, and CI dependency wiring while leaving the web app and original s01-s06 scripts unchanged.

Constraint: Preserve existing agents/*.py as the baseline and avoid web UI/app changes for this milestone
Constraint: Automated tests must not require OPENAI_API_KEY or network access
Rejected: Put LangChain files under agents/ | would blur the baseline boundary and risk web extractor churn
Confidence: high
Scope-risk: moderate
Tested: python -m py_compile agents_langchain/*.py tests/test_langchain_agents_smoke.py
Tested: python -m pytest tests/test_agents_smoke.py tests/test_langchain_agents_smoke.py -q
Tested: env -u OPENAI_API_KEY python -m pytest tests/test_langchain_agents_smoke.py -q
Not-tested: Full pytest suite is blocked by pre-existing tests/test_s_full_background.py failure in unmodified agents/s_full.py
Not-tested: Live LangChain/OpenAI calls intentionally not run
The integrated LangChain milestone passed its targeted checks, but full repository pytest still failed in BackgroundManagerTests because a running background task with result=None rendered as '[running] None'. Normalizing the None case to the existing running placeholder keeps the capstone behavior aligned with the test and avoids a misleading status string.

Constraint: Full post-change verification should pass before concluding the milestone
Rejected: Leave the unrelated failure unresolved | would keep full pytest red at handoff time
Confidence: high
Scope-risk: narrow
Directive: Preserve the '(running)' placeholder contract for unfinished background tasks unless tests and user-visible output are updated together
Tested: python -m py_compile agents/s_full.py agents_langchain/*.py tests/test_langchain_agents_smoke.py; python -m pytest tests -q
Not-tested: Interactive manual run of agents/s_full.py background task commands
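
The None normalization described in the last commit message above can be sketched as follows. This is an illustrative reconstruction, not the code in agents/s_full.py: the function name `render_task_status`, its signature, and the `[status] result` layout are assumptions inferred from the '[running] None' string and the '(running)' placeholder quoted in that message.

```python
from typing import Optional

# Placeholder named in the commit's Directive trailer.
RUNNING_PLACEHOLDER = "(running)"


def render_task_status(status: str, result: Optional[str]) -> str:
    """Render one background-task line without leaking a literal 'None'."""
    # Assumed layout: "[<status>] <result>". An unfinished task has no result
    # yet, so substitute the placeholder instead of formatting None.
    shown = RUNNING_PLACEHOLDER if result is None else result
    return f"[{status}] {shown}"
```

Under these assumptions an unfinished task no longer shows the word None, while a finished task's result is passed through unchanged.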
Kun added 26 commits April 19, 2026 23:26
kun1s2 changed the title from "[codex] Close coding-deepgent MVP local agent harness core" to "[codex] Build coding-deepgent local parity baselines" Apr 19, 2026
kun1s2 marked this pull request as ready for review April 19, 2026 22:30
kun1s2 commented Apr 19, 2026

Release-validation is complete. Current status:

  • acceptance circle1: pass
  • acceptance circle2: pass
  • pytest -q coding-deepgent/tests: 438 passed
  • frontend CLI test/typecheck: pass
  • ruff + mypy: pass
  • PR title/body refreshed to current local baseline scope
  • branch rebased/merged with latest upstream main and README conflicts resolved

The PR is now mergeable from a code/conflict standpoint. The only remaining blockers are repository-side permissions and external Vercel authorization checks. My GitHub identity does not have permission to execute MergePullRequest for this repo, so merge must be completed by a maintainer or someone with merge rights.
