refactor(bugfix): convert commands to skills, add CLAUDE.md, clean up #42
Open
jwm4 wants to merge 28 commits into ambient-code:main
- Convert all 5 commands (reproduce, diagnose, fix, test, document) into
detailed skills in .claude/skills/{name}/SKILL.md
- Rewrite commands as thin wrappers that delegate to skills, passing
arguments and session context
- Add CLAUDE.md with behavioral guidelines extracted from amber.md
(engineering discipline, safety guardrails, quality standards)
- Remove .claude/agents/amber.md (redundant with systemPrompt persona)
- Remove supplementary files (AMBER_RESEARCH.md, FIELD_REFERENCE.md,
TEMPLATE_FEEDBACK.md)
- Update systemPrompt to reference skills alongside commands
- Update README to reflect new structure

- Add .claude/skills/pr/SKILL.md with systematic fork-based PR workflow
- Add .claude/commands/pr.md thin wrapper
- Handles pre-flight checks, fork setup, branch/commit/push, and PR creation
- Includes error recovery for common failures (no push access, sandbox restrictions, missing fork)
- Update systemPrompt and startupPrompt to include /pr as phase 6
- Update README with PR phase documentation

Make CLAUDE.md self-contained by including workflow phases, commands, skill locations, and artifact path directly instead of referencing ambient.json (which is platform plumbing, not model-visible). Also fix markdown lint warnings (emphasis-as-heading, code block language).
- Add .claude/skills/review/SKILL.md: critically evaluates fix and tests, issues verdict (inadequate fix / incomplete tests / solid), recommends next steps
- Add .claude/commands/review.md thin wrapper
- Update /test skill to report verification.md path and summarize results inline to the user
- Update systemPrompt, startupPrompt, CLAUDE.md, README with /review as optional phase 5 between test and document

- systemPrompt: add PHASE TRANSITIONS section requiring agent to stop and summarize at end of each phase instead of auto-advancing
- CLAUDE.md: add Flow Control section reinforcing pause behavior, trim to ~70 lines per best practices
- pr/SKILL.md: add placeholders table defining GH_USER, FORK_OWNER, UPSTREAM_OWNER/REPO, REPO, and BRANCH_NAME with sources
- pr/SKILL.md: rewrite Step 2 to ask user before creating fork, wait for confirmation, never silently skip ahead
- pr/SKILL.md: replace flat fallback with 4-rung fallback ladder, patch file is now absolute last resort
- ambient.json: remove rubric (deferred to future PR), remove startupPrompt and results fields
- Delete ambient.clean.json (no longer needed)
- AGENTS.md: add sandbox restrictions table

- All command wrappers now say 'Read the file ... now and follow it step by step' instead of 'Using the X skill from ...' which the agent was interpreting as conceptual guidance rather than a file-read instruction
- systemPrompt: add CRITICAL prefix to skill-reading instruction, explicitly say 'use the Read tool to open the referenced SKILL.md file'
- pr/SKILL.md: add 'IMPORTANT: Follow This Skill Exactly' section at top with anti-improvisation rules
- pr/SKILL.md: add pre-flight summary checkpoint between Steps 1 and 2
- pr/SKILL.md: upgrade 'no fork' path from 'STOP' to 'HARD STOP' with explicit instructions to wait for user twice (before fork attempt, after fork attempt fails)
- pr/SKILL.md: add fork verification command before proceeding to Step 3
- Add critical rules: never attempt gh repo fork without asking user first, never fall back to patch files without exhausting all other options

- systemPrompt: add step to tell user which skill is being invoked before reading it, so user can correct if wrong phase was picked
- CLAUDE.md: reinforce skill announcement policy

- Step 1a: don't dead-end on missing gh auth; continue pre-flight to gather info from git, then present options
- Step 1b: add fallback git config when gh is unavailable
- Step 1d: add git-remote-based fallback for upstream repo identification
- Add decision point after pre-flight: present 3 clear options to user (set up auth, provide fork URL, or prepare for manual push) and WAIT for response instead of dumping all manual instructions at once
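
The git-remote-based fallback for upstream identification (Step 1d above) could look roughly like the sketch below. It is not the logic from pr/SKILL.md, which is not shown here; it only assumes the remote URL has one of the two common GitHub shapes (HTTPS or SSH). The function name `parse_github_remote` and the example owner/repo values are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the two common GitHub remote URL shapes (an assumption of this sketch):
#   https://github.com/OWNER/REPO(.git)
#   git@github.com:OWNER/REPO(.git)
_REMOTE_RE = re.compile(
    r"^(?:https://github\.com/|git@github\.com:)"
    r"(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+?)(?:\.git)?/?$"
)


def parse_github_remote(url: str) -> Optional[Tuple[str, str]]:
    """Return (owner, repo) for a GitHub remote URL, or None if it does not match."""
    match = _REMOTE_RE.match(url.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("repo")


if __name__ == "__main__":
    # Hypothetical remotes, for illustration only.
    print(parse_github_remote("https://github.com/example-org/example-repo.git"))
    print(parse_github_remote("git@github.com:example-user/example-repo"))
```

The URL itself would typically come from `git remote get-url origin`, which works without gh authentication. Returning None for unrecognized URLs lets the caller fall through to asking the user instead of guessing.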
Every skill now ends with a 'When This Phase Is Done' section that:
- Tells the agent to summarize findings to the user
- Recommends the natural next step in the workflow
- Explicitly says to stop and wait for the user

This reinforces the phase-transition pause from the systemPrompt and CLAUDE.md at the skill level, so the agent gets the instruction to stop regardless of whether it's following the systemPrompt guidance or reading the skill directly. Next-step recommendations follow the natural flow: reproduce → diagnose → fix → test → review → document → pr

- New /assess phase (phase 1): reads the bug report, presents understanding, identifies gaps, proposes reproduction plan, and waits for user confirmation before taking any action. No code execution or cloning in this phase.
- systemPrompt: updated to 8-phase flow starting with ASSESS, added explicit instruction to start with assess when user provides a bug report
- CLAUDE.md: updated phase list to include assess
- pr/SKILL.md: clarified that 'never push to upstream' applies even to org bots and GitHub Apps, with no exceptions based on account type
- pr/SKILL.md: added reminder at pre-flight summary not to skip Step 2

- systemPrompt: scope 'start with ASSESS' to initial bug report only
- systemPrompt: add 'INTERPRETING USER RESPONSES' section so agent maps phase names and confirmations to the correct next phase
- assess skill: allow cloning and reading code (but not executing)
- assess skill: add Step 2 to check for repo and clone if missing, read referenced PRs/files to inform assessment
- assess skill: update output to note repo is cloned for later phases

- Add awareness of remaining phases (review, document, pr) so agent recommends the full path instead of jumping to PR
… via Task tool

Instead of the orchestrator reading and executing skill files directly (which caused context bleed and skill-selection failures), Amber now dispatches each phase to a subagent using the Task tool. Changes:
- systemPrompt declares Amber as an orchestrator with dispatch protocol
- Command wrappers instruct dispatch via Task tool (not direct execution)
- Skill 'When This Phase Is Done' sections return results (not user-facing)
- CLAUDE.md reinforces orchestrator role and subagent dispatch pattern

Replace systemPrompt-based orchestration with a dedicated controller skill (.claude/skills/controller/SKILL.md) that the model reads and follows. The controller has:
- Phase table mapping commands to skill paths
- Explicit dispatch protocol via Task tool
- User response interpretation table (yes → last recommendation, not repeat)
- Critical rule: never re-dispatch the phase that just finished

systemPrompt is now minimal: identity + pointer to controller. Command wrappers all route through the controller. CLAUDE.md references the controller instead of restating flow rules.
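
To make the controller's phase table and its "never re-dispatch the phase that just finished" rule concrete, here is a minimal sketch of that logic. The real controller is a Markdown skill file that the model reads, not Python code; the phase order and the `.claude/skills/{name}/SKILL.md` path pattern come from the commits above, while the names `PHASES`, `skill_path`, and `next_phase` are hypothetical.

```python
from typing import List, Optional

# Phase order taken from the commit messages above (8-phase flow starting with assess).
PHASES: List[str] = [
    "assess", "reproduce", "diagnose", "fix",
    "test", "review", "document", "pr",
]


def skill_path(phase: str) -> str:
    """Map a phase (slash command without the slash) to its skill file path."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return f".claude/skills/{phase}/SKILL.md"


def next_phase(finished: str) -> Optional[str]:
    """Default recommendation after a phase completes.

    Always moves forward, so the phase that just finished is never
    recommended again. Returns None after the last phase.
    """
    index = PHASES.index(finished)
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


if __name__ == "__main__":
    print(skill_path("fix"))
    print(next_phase("test"))
```

This only models the default path. Later commits in this PR make the recommendation flexible (skipping forward, going back, or ending early), which a real implementation would need to layer on top of this default.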
…subagents

Switch from Task tool dispatch to direct execution so users see full progress in the platform UI. The controller re-reads itself after each phase to keep transition rules fresh and prevent getting stuck. This preserves the correct transition behavior while restoring visibility.

Move all next-step logic out of individual skills and into the controller. Skills now only report findings; the controller decides what to recommend. Recommendations are flexible: the controller considers skipping forward, going back, or ending early based on what actually happened, and presents multiple options instead of a single rigid next step.

… descriptions
- Remove next-step recommendations from all 8 skills (they just report findings)
- Controller owns all recommendation logic with flexible guidance
- Add one-sentence descriptions to each phase in the controller
- Remove response-interpretation table (model handles this naturally)

…ncements
- systemPrompt and CLAUDE.md: re-read controller when choosing/starting a phase
- Controller: explicit phase announcement with example before execution
- Avoids re-reading on every message (context budget concern)

- pr/SKILL.md: use .parent.owner.login and .parent.name instead of non-existent .parent.nameWithOwner in fork detection query
- ADR: add lesson 13 (test API responses against real data), lesson 5 (controller re-read timing), lesson 6 (phase descriptions improve recommendations); update lesson 2 (drop response-interpretation table); renumber to 15 total lessons
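
The fork-detection fix above can be illustrated with a small sketch. It assumes the query returns JSON containing a `parent` object with `name` and `owner.login` fields (for example from something like `gh repo view --json parent`; the exact command used in pr/SKILL.md is not shown here). The function name and the sample payloads are hypothetical.

```python
import json
from typing import Optional


def upstream_from_repo_view(payload: str) -> Optional[str]:
    """Return 'OWNER/REPO' of the parent repository, or None if this is not a fork.

    Builds the value from parent.owner.login and parent.name, since the
    parent object has no nameWithOwner field to read directly.
    """
    data = json.loads(payload)
    parent = data.get("parent")
    if not parent:
        # A null or missing parent means the repository is not a fork.
        return None
    return f"{parent['owner']['login']}/{parent['name']}"


if __name__ == "__main__":
    # Hypothetical payloads, shaped like the assumed query output.
    fork = '{"parent": {"name": "example-repo", "owner": {"login": "example-org"}}}'
    not_fork = '{"parent": null}'
    print(upstream_from_repo_view(fork))
    print(upstream_from_repo_view(not_fork))
```

In line with lesson 13 in the ADR, a sketch like this is only trustworthy once it has been run against a real response rather than a hand-written payload.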
…ontroller
- pr/SKILL.md: recognize bot 403 as expected, use GitHub compare URL for manual PR creation instead of bare fork URL, prevent patch-file spiral
- review/SKILL.md: remove A/B/C verdict labels (meaningless to users)
- controller/SKILL.md: add 'continue to next step' as default recommendation, clarify recommendation timing
- ADR: add lesson 14 (GitHub App on user ≠ permission on upstream), now 16 total

- All 8 skills now end with 'Then re-read the controller for next-step guidance'
- Removed redundant re-read instructions from controller, systemPrompt, CLAUDE.md
- Controller Step 4 now acknowledges the skill-triggered re-read instead of duplicating it
- Net effect: one place triggers the re-read (end of skill), one place handles it (controller)
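
The compare-URL fallback mentioned in the pr/SKILL.md commit above (used when the bot's PR creation attempt returns 403) can be sketched as follows. The parameter names mirror the placeholders listed for pr/SKILL.md (UPSTREAM_OWNER, REPO, FORK_OWNER, BRANCH_NAME); the function name, the `base` default, and the example values are assumptions, and the `?expand=1` suffix is included on the understanding that it opens the pull request form directly.

```python
from urllib.parse import quote


def compare_url(upstream_owner: str, repo: str, fork_owner: str,
                branch_name: str, base: str = "main") -> str:
    """Build a GitHub cross-fork compare URL the user can open to create the PR manually."""
    # Keep '/' unescaped so branch names like 'bugfix/issue-123' stay readable.
    head = f"{quote(fork_owner, safe='')}:{quote(branch_name, safe='/')}"
    return (
        f"https://github.com/{upstream_owner}/{repo}"
        f"/compare/{quote(base, safe='/')}...{head}?expand=1"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(compare_url("example-org", "example-repo", "example-user", "bugfix/issue-123"))
```

Compared with handing the user a bare fork URL, this lands them on a page that already has the base and head branches selected.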
Contributor
This is solid, I'll load it as a custom workflow and see how we do. Thanks!
Contributor
Author
Oh, I forgot to mention this in the PR description. Here are three things that I think should be in this workflow but are not in this PR because I wanted to break up the work a little:
Bugfix workflow: skill-based architecture with workflow controller
Problem
The bugfix workflow had all process logic inline in command files. This meant users only got the workflow's carefully designed process if they typed the exact slash command — saying "fix this bug" in natural language would bypass all of it. Phase transitions were unreliable, the agent would auto-advance without pausing, and the PR creation process failed in predictable ways that weren't handled.
Changes
Architecture: commands → skills → controller
Refactored from a command-heavy design to a three-layer architecture:
- Controller (.claude/skills/controller/SKILL.md): owns all phase transitions, recommendations, and flow logic. The single source of truth for "what happens next."
- Skills (.claude/skills/{name}/SKILL.md): contain the full multi-step process for each phase. Report findings only; never recommend next steps.
- Commands (.claude/commands/*.md): thin pointers (~5 lines each) that route to the controller.

This means users get the workflow's benefit whether they type /fix or "fix this bug."

New phases:
- /assess: initial phase that reads the bug report, summarizes understanding, identifies gaps, and proposes a plan before jumping into reproduction
- /review: optional phase after /test that critically evaluates the fix and tests, looking for gaps and missed edge cases
- /pr: systematic, failure-resistant pull request creation with fork workflows, bot identity handling, and a 4-rung fallback ladder

New files:
- workflows/bugfix/CLAUDE.md: behavioral guidelines (principles, hard limits, safety, quality, escalation)
- docs/adr/2026-02-12-bugfix-lessons-learned.md: 16 lessons learned about prompt engineering for multi-phase AI workflows

Deleted files:
- .claude/agents/amber.md: redundant with systemPrompt; useful content moved to CLAUDE.md
- AMBER_RESEARCH.md, FIELD_REFERENCE.md, TEMPLATE_FEEDBACK.md: supplementary docs not used by the workflow
- .ambient/ambient.clean.json: unused duplicate

Simplified systemPrompt:
Reduced from ~100 lines to ~10 lines. The systemPrompt now just identifies the agent, points at the controller, and lists the workspace layout. All behavioral rules live in CLAUDE.md; all process logic lives in skills.

Key design decisions
… /installation/repositories, uses the correct jq fields for fork detection, and treats gh pr create 403 as an expected outcome (provides a GitHub compare URL) rather than an error to debug.

Testing
Iteratively tested on the Ambient Code Platform across multiple sessions. The ADR documents all issues encountered and how they were resolved.