
feat(bench): persist runtime decision points#191

Merged
drewstone merged 4 commits into main from feat/bench-runtime-decision-points on Jun 7, 2026

@drewstone

Contributor

Summary

  • record runtime semantic decision points in the benchmark hook recorder
  • persist runtimeDecisionPoints beside runtimeEvents in bench corpus rows
  • thread decision points through SWE/commit0/runLoop benchmark writers
  • add corpus writer coverage proving decision points survive artifact creation
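The recorder change in the first two bullets can be pictured with a minimal sketch. The type shapes, the `HookRecorder` class, and its method names are hypothetical and chosen for illustration; only the field names `runtimeEvents` and `runtimeDecisionPoints` come from this PR.

```typescript
// Hypothetical shapes: the real types in agent-runtime-bench may differ.
interface RuntimeEvent {
  kind: string;
  at: number;
}

interface RuntimeDecisionPoint {
  id: string;
  question: string;
  chosen: string;
  at: number;
}

interface RecorderSnapshot {
  runtimeEvents: RuntimeEvent[];
  runtimeDecisionPoints: RuntimeDecisionPoint[];
}

// Hypothetical recorder that captures both streams, so decision points
// travel with the lifecycle events instead of through a side channel.
class HookRecorder {
  private readonly events: RuntimeEvent[] = [];
  private readonly decisionPoints: RuntimeDecisionPoint[] = [];

  onEvent(event: RuntimeEvent): void {
    this.events.push(event);
  }

  onDecisionPoint(point: RuntimeDecisionPoint): void {
    this.decisionPoints.push(point);
  }

  // Copy the arrays so hook calls made after the snapshot do not grow it.
  snapshot(): RecorderSnapshot {
    return {
      runtimeEvents: [...this.events],
      runtimeDecisionPoints: [...this.decisionPoints],
    };
  }
}

const recorder = new HookRecorder();
recorder.onEvent({ kind: "run:start", at: 0 });
recorder.onDecisionPoint({ id: "dp-1", question: "retry failing test?", chosen: "retry", at: 1 });
const recorded = recorder.snapshot();
```

A benchmark writer would then spread `recorded` into the row it persists, which is roughly what "thread decision points through" the writers amounts to under these assumptions.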

Why

Runtime hooks were emitting semantic decision points, but the benchmark recorder only captured lifecycle events. This change keeps belief-state evidence in the same append-only corpus row as the run, without relying on logs or side channels.
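As a rough illustration of the "same append-only corpus row" idea, the sketch below keeps decision points next to lifecycle events in one serialized row and checks that they survive a write/read round trip. The JSON-lines encoding, the `CorpusRow` shape, and the `appendRow`/`readRows` helpers are assumptions made for this example, not the bench package's actual API.

```typescript
// Hypothetical row shape: only the two field names below come from this PR.
interface CorpusRow {
  runId: string;
  runtimeEvents: { kind: string; at: number }[];
  runtimeDecisionPoints: { id: string; chosen: string; at: number }[];
}

// Append-only (assumed JSON-lines encoding): earlier lines are never
// rewritten, each new row is one more line at the end.
function appendRow(corpus: string, row: CorpusRow): string {
  return corpus + JSON.stringify(row) + "\n";
}

// Parse every non-empty line back into a row.
function readRows(corpus: string): CorpusRow[] {
  return corpus
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as CorpusRow);
}

let corpusText = "";
corpusText = appendRow(corpusText, {
  runId: "run-1",
  runtimeEvents: [{ kind: "run:start", at: 0 }],
  runtimeDecisionPoints: [{ id: "dp-1", chosen: "retry", at: 1 }],
});
corpusText = appendRow(corpusText, {
  runId: "run-2",
  runtimeEvents: [{ kind: "run:start", at: 5 }],
  runtimeDecisionPoints: [],
});

const corpusRows = readRows(corpusText);
```

Because both arrays live in the same row, a reader that loads `run-1` gets its decision points without joining against a separate log.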

Verification

  • pnpm --filter @tangle-network/agent-runtime-bench exec tsx src/corpus.test.mts
  • pnpm typecheck
  • pnpm lint (passes; existing warnings in tests/profiles/coder.test.ts)
  • pnpm test

Note: pnpm --filter @tangle-network/agent-runtime-bench exec tsc --noEmit currently resolves TypeScript 4.5.4 and already fails without this change, on existing TS config and Node type syntax; the root pnpm typecheck passes.

drewstone added 2 commits June 8, 2026 00:11
…cision-points

# Conflicts:
#	bench/src/commit0-gate.mts
#	bench/src/corpus.test.mts
@tangletools

Contributor

⚠️ Review Interrupted — 8fdf12b0

The review runner stopped before publishing a final verdict: webhook_restarted.

State: Interrupted
Detail: webhook restarted

No review verdict was produced for this run. Trigger a fresh review on the current PR head if the PR is still open.

tangletools · #191 · model: kimi-for-coding · updated 2026-06-07T21:16:35Z

@drewstone drewstone merged commit 0c42a53 into main Jun 7, 2026
1 check passed
2 participants