Skip to content

Comments

Claude/fix UI performance#662

Draft
jeremyeder wants to merge 2 commits intoambient-code:mainfrom
jeremyeder:claude/fix-ui-performance-wNNnP
Draft

Claude/fix UI performance#662
jeremyeder wants to merge 2 commits intoambient-code:mainfrom
jeremyeder:claude/fix-ui-performance-wNNnP

Conversation

@jeremyeder
Copy link
Collaborator

No description provided.

Three root causes of UI slowness during the primary user journey:

1. Every SSE token triggered a full React re-render (~20-50/sec during
   streaming). Added requestAnimationFrame-based event batching so
   multiple SSE events within a single frame are processed in one
   synchronous pass. React 18 batches all setState calls within the
   same callback, reducing re-renders from ~50/sec to ~60fps max.

2. sendMessage callback was recreated on every state change because it
   depended on state.threadId, state.runId, and state.status. Replaced
   with refs so the callback is stable across state changes, preventing
   cascading re-renders in child components.

3. Three overlapping polling intervals fired simultaneously during
   session startup: useSession (500-1000ms), workflow queue (2000ms),
   and message queue (2000ms). The latter two were redundant since
   useSession already polls aggressively during transitional states.
   Removed the duplicate intervals.

4. Initial prompt tracking effect had unnecessary deps
   (aguiState.messages.length, aguiState.status) that caused it to
   re-evaluate on every streamed message, even though it only updates
   a ref. Reduced to only depend on session.spec.initialPrompt.

https://claude.ai/code/session_01P6sFQgvLjBHd6zuieQuzyz
- Consolidate three separate state-sync refs (threadId, runId, status)
  into a single stateSnapshotRef with proper typed status field
- Add mountedRef guard to flushEventBuffer to prevent stale setState
  calls after unmount
- Update React version reference in comment (18 → 18+)

https://claude.ai/code/session_01P6sFQgvLjBHd6zuieQuzyz
@github-actions
Copy link
Contributor

github-actions bot commented Feb 21, 2026

Claude Code Review

Summary

This PR addresses real UI jank during agent streaming by tackling four root causes: per-token re-renders, an unstable sendMessage callback, redundant polling intervals, and an over-specified useEffect dep array. The implementation is focused, well-commented, and architecturally sound. No security concerns — this is a pure frontend performance optimization touching only two files.

Issues by Severity

🚫 Blocker Issues

None.

🔴 Critical Issues

None.

🟡 Major Issues

1. Removed polling may miss terminal-state queuing edge case (page.tsx:311-313, 334-336)

The removed setInterval(refetchSession, 2000) effects fired whenever a workflow or message was queued AND the session was not yet Running. This covered all non-Running states, including terminal ones (Stopped, Failed, Completed).

useSession returns refetchInterval: false for terminal states, meaning it stops polling entirely. If a user queues a workflow while the session is Stopped and the desired-phase annotation is not yet applied (or is applied after the queue is checked), there may be a brief window where the session never polls its way to Running.

This is unlikely to occur in normal flow (workflow activation probably sets the annotation first), but worth explicitly verifying or documenting why terminal-state queuing is safe to drop.

🔵 Minor Issues

2. One-render lag in stateSnapshotRef (use-agui-stream.ts:127-133)

The snapshot is updated via useEffect, which runs after the render that caused the state change. If sendMessage is called synchronously in the same event loop tick as a state change (e.g., RUN_FINISHED arrives → state updates → user immediately clicks Send before the next render), the ref still holds the old status/threadId/runId.

In practice this is benign because sendMessage involves a fetch() that yields the microtask queue, and the UI typically prevents sends during streaming. Worth a comment acknowledging the intentional trade-off.

3. processEventRef has a brief stale-closure window (use-agui-stream.ts:638-642)

const processEventRef = useRef(processEvent)
useEffect(() => {
  processEventRef.current = processEvent
}, [processEvent])

Between when processEvent is recreated (new render) and when the useEffect fires, any rAF-batched events will call the previous processEvent. Since processEvent depends on [onEvent, onMessage, onError, onTraceId], this only matters if those callbacks are unstable. In practice the window is sub-frame and extremely unlikely to cause visible issues, but using useLayoutEffect here would close it entirely if it ever becomes observable.

4. initialState defined outside component with shared Map instances (use-agui-stream.ts:58-71) — pre-existing

const initialState: AGUIClientState = {
  pendingToolCalls: new Map(),  // ← created once for the module
  ...
}

These Map objects are created once at module load. While the code never mutates them in-place (it always creates new Map(...) copies before modifying), this is fragile. A future developer could write initialState.pendingToolCalls.set(...) and create a hard-to-debug shared-state bug across multiple hook instances. Not introduced by this PR, but worth converting to a factory function:

const createInitialState = (): AGUIClientState => ({
  pendingToolCalls: new Map(),
  pendingChildren: new Map(),
  messageFeedback: new Map(),
  ...
})
// useState(createInitialState) — React calls the factory once per instance

5. No tests for the new batching behavior

The rAF batching logic (flushEventBuffer, scheduleFlush) is non-trivial and hard to observe visually. A unit test using jest.useFakeTimers() + requestAnimationFrame mocks would guard against regressions (e.g., a future refactor accidentally calling processEvent directly instead of buffering). Not a blocker for a perf fix, but recommended as follow-up.

6. processMessages IIFE not awaited (page.tsx:345-358) — pre-existing

const processMessages = async () => { ... }
processMessages()  // fire-and-forget

Errors inside processMessages are caught per-message, but any unexpected throw would produce an unhandled rejection. Pre-existing issue, not introduced here.

Positive Highlights

  • rAF batching is correctly implemented. eventBufferRef accumulates events synchronously, scheduleFlush() only schedules one rAF per frame, and flushEventBuffer calls processEventRef.current in a single synchronous loop. React 18 automatic batching then collapses all resulting setState calls into one render — exactly as intended.
  • Cleanup is thorough. cancelAnimationFrame is called in both the unmount effect and disconnect(), and eventBufferRef is cleared on disconnect. No leaked rAF handles.
  • sendMessage stabilization is the right approach. Replacing state.threadId, state.runId, state.status deps with stateSnapshotRef eliminates cascading re-renders in all child components that receive sendMessage as a prop. The dep reduction from 6 → 2 deps is the correct minimal set.
  • Removed polling is justified. useSession already polls at 500–1000ms during Pending/Creating/Stopping states and 500ms when a desired-phase annotation is present. The removed 2000ms intervals were strictly slower and redundant.
  • Comments explain the why, not the what. The commit messages and inline comments document the root cause analysis and trade-offs — exactly what's needed for performance changes that could otherwise look like gratuitous complexity.
  • Zero TypeScript any types introduced. Fully typed throughout.
  • No security surface touched. No auth tokens, API keys, or sensitive data flow through this path.

Recommendations

  1. Verify or document the terminal-state queuing edge case (major) — add a comment explaining why the removed polls are safe even if a workflow/message is queued while session is Stopped.
  2. Add a comment to stateSnapshotRef effect acknowledging the one-render lag is intentional and why it's safe.
  3. Convert initialState to a factory function (minor, follow-up) — prevents future shared-Map footguns.
  4. Add unit tests for flushEventBuffer/scheduleFlush (follow-up) using fake timers and mocked requestAnimationFrame.

Overall this is a high-quality, well-reasoned performance fix. The major concern (item 1) is worth confirming before merge but is unlikely to affect real users given normal flow assumptions.

🤖 Generated with Claude Code


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants