Add evidence-first prompt outputs by azalio · Pull Request #122 · azalio/map-framework

azalio · 2026-05-17T17:50:33Z

Summary

Add shared evidence-first output examples for MAP review, debug, and planning prompts
Require quoted evidence before high-risk review verdicts, debug root causes, risks, scores, and decomposition boundaries
Sync shipped templates, document the behavior, move 2604.027 to done, and leave generic JSON prompt linting as follow-up 2604.027-1
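
As a rough illustration of the evidence-first rule summarized above, the following sketch flags a high-impact field that appears without any quoted evidence. It is not the shipped contract: only `evidence[]` is named in this PR, while the `quote` key, the other field names, and the function name are assumptions made for the example.

```python
from typing import Any

# Hypothetical field names; only `evidence[]` is named in this PR.
HIGH_IMPACT_FIELDS = ("verdict", "root_cause", "risk", "score")


def evidence_first_violations(output: dict[str, Any]) -> list[str]:
    """Return high-impact fields that appear without any quoted evidence."""
    evidence = output.get("evidence") or []
    # An evidence entry counts only if it carries a non-empty quote
    # (the `quote` key is an assumption for this sketch).
    has_quotes = any(
        isinstance(item, dict) and str(item.get("quote", "")).strip()
        for item in evidence
    )
    if has_quotes:
        return []
    return [field for field in HIGH_IMPACT_FIELDS if field in output]
```

An output such as `{"verdict": "block"}` would be reported as a violation, while the same verdict accompanied by at least one quoted evidence entry would pass.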

Validation

pytest tests/test_skills.py::TestEvidenceFirstPromptContracts tests/test_skills.py::TestSkillStructure::test_local_markdown_supporting_links_resolve tests/test_template_sync.py -v
uv run --no-sync mapify init /var/folders/3j/zmvdy5_56bjcg1kmrx05dltcf7yldq/T/opencode/mapify-evidence-outputs-20260517202350 --no-git --mcp none, then inspected generated reference and skill prompt lines
make lint
pytest -m "not slow"
pytest (attempted; timed out after 15 minutes at tests/integration/test_e2e_claude_sdk.py::TestMapEfficientE2E::test_efficient_produces_code_changes after deterministic tests and the first three slow SDK tests passed)

Follow-up

2604.027-1 tracks generic JSON prompt-contract linting for future MAP skill prompt changes
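
One possible starting point for that follow-up, offered only as a sketch: extract each fenced `json` example from a prompt file and check that it parses. The function name and the error format are hypothetical, and nothing here reflects how 2604.027-1 will actually be implemented.

```python
import json
import re

# Built indirectly so this example can itself sit inside a fenced block.
FENCE = "`" * 3
JSON_BLOCK = re.compile(FENCE + r"json\s*\n(.*?)\n" + FENCE, re.DOTALL)


def lint_json_examples(markdown: str) -> list[str]:
    """Return one error message per fenced json block that fails to parse."""
    errors = []
    for index, match in enumerate(JSON_BLOCK.finditer(markdown), start=1):
        try:
            json.loads(match.group(1))
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: {exc.msg} (line {exc.lineno})")
    return errors
```

A syntax check like this would only catch malformed examples; verifying that each example follows the evidence-first ordering would need a separate, contract-aware pass.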

Copilot

Pull request overview

This PR introduces an “evidence-first” output contract for MAP’s highest-risk workflows (review, debug, plan) by adding a shared reference of compact JSON examples and wiring skill prompts/tests to require quoted evidence before high-impact judgments (verdicts, root causes, risk levels, scores, decomposition boundaries).

Changes:

Added shared evidence-first JSON output examples and linked them from /map-review, /map-debug, and /map-plan skill prompts.
Added regression tests to enforce evidence-first requirements and ensure shared reference files are byte-for-byte synced into shipped templates.
Updated docs and improvement-plan ledger entries to reflect the shipped behavior and split generic linting into follow-up 2604.027-1.
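
The byte-for-byte sync requirement can be pictured with a minimal sketch like the one below. It is an assumption about the general shape of such a check, not the code in `tests/test_template_sync.py`; the helper name is hypothetical.

```python
from pathlib import Path


def out_of_sync_files(source_dir: Path, shipped_dir: Path) -> list[str]:
    """List relative paths whose bytes differ or that are missing from shipped_dir."""
    mismatches = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(source_dir)
        shipped = shipped_dir / relative
        # Compare raw bytes so line-ending or encoding drift is also caught.
        if not shipped.is_file() or shipped.read_bytes() != source.read_bytes():
            mismatches.append(relative.as_posix())
    return mismatches
```

In this repository the two directories would presumably be `.claude/references` and `src/mapify_cli/templates/references`, and a regression test would assert that the returned list is empty.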

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.

Summary per file

File	Description
tests/test_template_sync.py	Adds sync regression coverage for `.claude/references/` → `templates/references/`.
tests/test_skills.py	Adds prompt-contract regression tests requiring evidence-first terms + shared examples coverage.
src/mapify_cli/templates/skills/map-review/SKILL.md	Requires `evidence[]` before verdict/risk/score outputs and links shared examples (shipped template).
src/mapify_cli/templates/skills/map-debug/SKILL.md	Requires quoted logs/tests/code before root-cause/verdict/risk/score outputs and links shared examples (shipped template).
src/mapify_cli/templates/skills/map-plan/SKILL.md	Requires evidence-first spec review + decomposition evidence before subtasks and links shared examples (shipped template).
src/mapify_cli/templates/references/map-output-examples.md	Adds the shipped “Evidence-First Output Examples” reference file.
.claude/skills/map-review/SKILL.md	Mirrors evidence-first prompt contract updates in source-of-truth skill.
.claude/skills/map-debug/SKILL.md	Mirrors evidence-first prompt contract updates in source-of-truth skill.
.claude/skills/map-plan/SKILL.md	Mirrors evidence-first prompt contract updates in source-of-truth skill.
.claude/references/map-output-examples.md	Adds the source-of-truth evidence-first examples reference file.
docs/USAGE.md	Documents evidence-first reviewer/debug/plan output behavior for users.
docs/ARCHITECTURE.md	Documents evidence-first root-cause/validation and review contracts in architecture narrative.
docs/learned/testing-strategies.md	Records learned testing strategy: sync shared references, not just skills.
docs/learned/review-checks.md	Records learned review check about separating shipped prompt behavior vs lint tooling scope.
docs/learned/commands.md	Records learned command workflow: smoke generated evidence references via `mapify init`.
docs/improvement-plan.md	Moves 2604.027 lint tooling into new follow-up 2604.027-1 and clarifies shipped scope.
docs/improvement-loop-log.md	Logs closure of 2604.027 with validation notes and follow-up split.
docs/improvement-done.md	Marks 2604.027 as done with explicit shipped evidence-first slice summary.

Add evidence-first prompt outputs

7e43d20

Copilot AI review requested due to automatic review settings May 17, 2026 17:50

Copilot started reviewing on behalf of azalio May 17, 2026 17:51

Copilot AI reviewed May 17, 2026

Record evidence outputs PR link

b5aa113

azalio merged commit 065b5a8 into main May 17, 2026
6 checks passed

azalio deleted the codex/2604-027-evidence-outputs branch May 17, 2026 19:15

azalio merged 2 commits into main from codex/2604-027-evidence-outputs
