feat(a2a): implement full A2A task lifecycle state support by agent-of-mkmeral · Pull Request #2245 · strands-agents/sdk-python

agent-of-mkmeral · 2026-05-04T19:22:29Z

Summary

Implements full A2A task lifecycle state support in the Strands SDK. Previously, A2A tasks could only end in completed — errors crashed, cancellation was unsupported, and there was no way for an agent to ask the user for more input mid-task. This PR adds proper support for failed, canceled, input_required, rejected, and auth_required states.

Related to: #1371

What Actually Changes (Before → After)

1. Agent Errors No Longer Crash the Task

	Before	After
Behavior	If `agent.stream_async()` threw an exception, it propagated as an unhandled `ServerError` — the client got a JSON-RPC error and no task state transition	The exception is caught; the task gracefully transitions to `TaskState.failed` with a status message
Client sees	`{"error": {"code": -32603, "message": "Internal error"}}`	A proper `TaskStatusUpdateEvent` with `state: "failed"` and a message like "Agent execution failed"
Task state	Stuck in `working` forever (zombie task)	Cleanly in `failed` (terminal state)

2. Cancellation Now Works

	Before	After
Behavior	`cancel()` always raised `UnsupportedOperationError`	`cancel()` transitions the task to `canceled` and attempts cooperative agent shutdown
Client sees	Error response saying operation not supported	`TaskStatusUpdateEvent` with `state: "canceled"` + message
Agent	Not signaled	If agent has a `cancel()` method, it's called (best-effort)

3. Strands Interrupts → A2A `input_required`

This is the big one. Strands agents that use interrupts can now participate in multi-turn A2A workflows.

	Before	After
Behavior	If agent returned with `stop_reason="interrupt"`, the executor ignored it and called `updater.complete()` anyway	The interrupt is detected; task transitions to `input_required` with a status message listing what input is needed
Client sees	Premature `completed` with possibly empty/partial content	`TaskStatusUpdateEvent` with `state: "input-required"` + message describing each interrupt (name + reason)
Workflow	One-shot only — no HITL (human-in-the-loop) possible	Client can send a follow-up message to resume the task

Example flow:

Client → send_message("Book me a flight to NYC")
Server → TaskStatus: working (streaming...)
Server → TaskStatus: input_required
         message: "Agent requires input:\n- confirm_booking: Please confirm: NYC flight on May 10 for $450"
Client → send_message("Yes, confirm it")
Server → TaskStatus: working (streaming...)
Server → TaskStatus: completed
         artifact: "Flight booked! Confirmation: ABC123"

4. Client-Side (`A2AAgent`) Handles All States

	Before	After
`_is_complete_event()`	Only recognized `TaskState.completed`	Recognizes all 6 lifecycle states as "stream complete" events
`AgentResult.stop_reason`	Always `"end_turn"`	`"end_turn"` for terminal states (completed/failed/canceled/rejected); `"interrupt"` for pausing states (input_required/auth_required)
`AgentResult.state`	Empty `{}`	Contains `{"a2a_task_state": "failed"}` (or whichever state) for programmatic inspection

Impact on A2A Tasks

Task State Machine (what this PR enables)

                    ┌──────────────────────────────────────────┐
                    │                                          │
  ┌─────────┐   ┌──▼───────┐   ┌────────────────┐   ┌─────────────┐
  │submitted│──►│ working  │──►│ input_required │──►│  completed  │
  └─────────┘   └──────────┘   └────────────────┘   └─────────────┘
                    │                                    
                    ├──────────────────────►┌────────┐
                    │         error         │ failed │
                    │                       └────────┘
                    │                                    
                    ├──────────────────────►┌──────────┐
                    │       cancel()        │ canceled │
                    │                       └──────────┘
                    │                                    
                    └──────────────────────►┌──────────┐
                          (future)          │ rejected │
                                            └──────────┘

Who Is Affected

User	Impact
A2A server operators (using `StrandsA2AExecutor`)	Tasks no longer zombie on errors. Cancel button works. Interrupt-based agents can now be exposed via A2A.
A2A clients (using `A2AAgent`)	`AgentResult.stop_reason` now differentiates terminal vs. pausing states. `state["a2a_task_state"]` gives exact A2A state.
Existing users	No breaking changes. Happy-path `completed` behavior is identical. The only visible difference: errors return `failed` status instead of crashing.

Backwards Compatibility

No dependency changes — uses existing a2a-sdk 0.3.26 APIs (TaskUpdater.failed(), .requires_input(), .cancel() were already there, just unused)
No API signature changes — StrandsA2AExecutor, A2AAgent have same constructors and methods
Behavioral change (errors only): Previously, agent errors caused a JSON-RPC error response. Now they cause a proper failed task state. This is spec-correct and strictly better for clients.

Changes

Server-side (`executor.py`)

execute(): Wraps streaming in try/except → updater.failed() on error
_execute_streaming(): Extracts result from stream loop; checks stop_reason == "interrupt" after streaming
_handle_interrupt_result(): New method — builds descriptive message from interrupt list, calls updater.requires_input()
cancel(): Calls agent.cancel() if available, then updater.cancel(). Raises ServerError only if no task or already terminal
Error messages on InternalError: Now includes "No valid content found..." / "Request message is missing..."

Client-side (`a2a_agent.py`)

_is_complete_event(): Recognizes _COMPLETE_STATES (all 6 lifecycle states)
State sets derived from _STATE_TO_STOP_REASON in _converters.py (single source of truth)

Converters (`_converters.py`)

_STATE_TO_STOP_REASON: Canonical mapping of A2A state → Strands stop_reason
_extract_task_state(): Helper to pull TaskState from response tuples
convert_response_to_agent_result(): Uses mapping for stop_reason; adds a2a_task_state to AgentResult.state

Testing

182 tests pass (existing + 22 new)
98% coverage on executor.py (0 missing statements)
New test cases cover:
- Error → failed transition
- Error when task already terminal (graceful handling)
- Single interrupt → input_required
- Multiple interrupts → input_required with all listed
- Cancel with valid task
- Cancel with no task (error)
- Cancel when already terminal (error)
- Cancel with agent.cancel() exception (best-effort)
- Client recognition of all 6 states
- State-to-stop_reason mapping for all states
- _extract_task_state helper

Quality

✅ Lint clean (ruff check)
✅ Format clean (ruff format)
✅ 98% patch coverage (codecov threshold met)
✅ Backwards compatible
✅ No dependency changes
✅ Structured logging throughout (task_id=<%s> | message)

cc @mkmeral

Implements all lifecycle states for the A2A protocol integration: 1. Error mapping → failed state: - Agent exceptions now gracefully transition tasks to TaskState.failed - Error details are communicated via the status message - ServerErrors (setup failures) still propagate as before 2. Cancel support: - cancel() now transitions tasks to TaskState.canceled with a message - Raises UnsupportedOperationError only when no current task exists - Handles already-terminal tasks gracefully 3. input_required via Strands interrupts: - Agent stop_reason='interrupt' maps to TaskState.input_required - Interrupt names and reasons are communicated to the client - Multiple interrupts are listed in the status message 4. Client-side handling of all states: - _is_complete_event recognizes all terminal states (completed, failed, canceled, rejected) and pausing states (input_required, auth_required) - convert_response_to_agent_result maps A2A states to Strands stop_reasons: * completed/failed/canceled/rejected → end_turn * input_required/auth_required → interrupt - AgentResult.state includes a2a_task_state for downstream inspection All changes use the existing a2a-sdk 0.3.26 - no dependency changes needed. Resolves: strands-agents#1371

- Fix B904: add 'from None' to re-raised ServerError in cancel() - Fix F401: remove unused imports (TaskStatus, Part, TextPart, Role) in tests - Fix I001: sort import blocks in test files - Fix ruff formatting in executor.py and test_executor.py

codecov · 2026-05-04T19:29:04Z

Codecov Report

❌ Patch coverage is 98.70130% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/strands/multiagent/a2a/executor.py	98.07%	0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

agent-of-mkmeral

TL;DR: FAIL — 2 bugs found. Both are TypeError crashes in _converters.py when A2A responses contain None parts lists.

Adversarial Testing Report

Scope: Full A2A lifecycle state support (executor error/interrupt/cancel handling, client-side state recognition, converter state mapping)
Tests written: 32
Tests passing: 30
Tests failing (findings): 2

#	Category	Severity	Description
1	Bug	Medium	`convert_response_to_agent_result` crashes with TypeError when `artifact.parts` is None
2	Bug	Medium	`convert_response_to_agent_result` crashes with TypeError when `status.message.parts` is None

Finding 1 — Bug (Medium): TypeError when artifact.parts is None

Location: src/strands/multiagent/a2a/_converters.py, line 136

Root Cause: The code checks hasattr(update_event.artifact, "parts") but does NOT check if parts is actually not None before iterating. Since hasattr returns True even when the attribute is set to None, this crashes.

Reproduction:

from unittest.mock import MagicMock
from a2a.types import TaskArtifactUpdateEvent
from strands.multiagent.a2a._converters import convert_response_to_agent_result

task = MagicMock()
task.artifacts = None

event = MagicMock(spec=TaskArtifactUpdateEvent)
artifact = MagicMock()
artifact.parts = None  # Has attribute, but it's None
event.artifact = artifact

# CRASH: TypeError: 'NoneType' object is not iterable
result = convert_response_to_agent_result((task, event))

Observed behavior: TypeError: 'NoneType' object is not iterable at line 136
Expected behavior: Should return an empty AgentResult gracefully

Fix:

# Line 135-136: Change from:
if update_event.artifact and hasattr(update_event.artifact, "parts"):
    for part in update_event.artifact.parts:
# To:
if update_event.artifact and hasattr(update_event.artifact, "parts") and update_event.artifact.parts:
    for part in update_event.artifact.parts:

Finding 2 — Bug (Medium): TypeError when status.message.parts is None

Location: src/strands/multiagent/a2a/_converters.py, line 142

Root Cause: The code checks that update_event.status.message is truthy, but does NOT check if message.parts is not None before iterating.

Reproduction:

from unittest.mock import MagicMock
from a2a.types import TaskState, TaskStatusUpdateEvent
from strands.multiagent.a2a._converters import convert_response_to_agent_result

task = MagicMock()
task.artifacts = None

status = MagicMock()
status.state = TaskState.failed
message = MagicMock()
message.parts = None  # Message exists but parts is None
status.message = message

event = MagicMock(spec=TaskStatusUpdateEvent)
event.status = status

# CRASH: TypeError: 'NoneType' object is not iterable
result = convert_response_to_agent_result((task, event))

Observed behavior: TypeError: 'NoneType' object is not iterable at line 142
Expected behavior: Should handle gracefully and fall back to task.artifacts

Fix:

# Line 141-142: Change from:
if update_event.status and hasattr(update_event.status, "message") and update_event.status.message:
    for part in update_event.status.message.parts:
# To:
if update_event.status and hasattr(update_event.status, "message") and update_event.status.message and update_event.status.message.parts:
    for part in update_event.status.message.parts:

What Survived (30 tests passed)

✅ Executor error handling — exceptions correctly transition to failed state
✅ Executor interrupt handling — stop_reason="interrupt" correctly maps to input_required
✅ Cancel support — properly transitions to canceled, handles already-terminal tasks
✅ ServerError re-raise — setup failures propagate correctly (not caught as generic)
✅ Client-side state recognition — all 6 lifecycle states correctly recognized
✅ State-to-stop_reason mapping — all states map correctly to end_turn/interrupt
✅ A2A-compliant streaming + lifecycle — interrupt/error work in new artifact mode
✅ State cleanup on error — _current_artifact_id and _is_first_chunk properly reset
✅ Executor reuse after failure — state is clean for subsequent calls
✅ Backward compatibility — normal completion, max_tokens, empty results all work
✅ Edge cases — empty interrupts, None interrupts, None reasons, multiple results

🤖 AI agent response. Strands Agents. Feedback welcome!

github-actions · 2026-05-04T19:35:02Z

Assessment: Comment

This PR is a well-structured implementation of the A2A task lifecycle states. The architectural decisions (error→failed, interrupt→input_required, cancel support) are sound and align well with the A2A protocol spec.

Review Categories

DRY / Single Source of Truth: State classification knowledge is duplicated between a2a_agent.py (_TERMINAL_STATES/_INPUT_STATES) and _converters.py (_STATE_TO_STOP_REASON). Consider consolidating to avoid divergence.
Security: Raw exception messages are passed to A2A clients in the failed state message, which could leak internal details.
Error Handling: Broad except RuntimeError catches could mask unrelated errors if the A2A SDK behavior changes.
Logging Style: New log messages don't follow the repo's structured logging format (field=<value> | message).
Documentation Accuracy: The cancel() docstring implies the agent execution is stopped, but the implementation only transitions the task state.

Good test coverage with both positive and negative cases, and backwards compatibility is maintained for existing completed state flows.

mkmeral · 2026-05-04T19:37:23Z

+            logger.warning("Cancellation requested but no current task found")
+            raise ServerError(error=UnsupportedOperationError()) from None
+
+        updater = TaskUpdater(event_queue, task.id, task.context_id)


agent has cancel method. where do we actually use it? I don't see it. where do we cancel the current agent run?

Good catch — fixed in the latest push. cancel() now calls self.agent.cancel() if the method exists (cooperative cancellation):

# Attempt to stop the agent if it supports cancellation if hasattr(self.agent, "cancel") and callable(self.agent.cancel): try: self.agent.cancel() except Exception: logger.debug("task_id=<%s> | agent cancel signal failed (non-critical)", task.id)

This is best-effort — if agent.cancel() fails, we still transition the A2A task state to canceled. The agent's current model call may complete, but no new iterations will start.

Docstring updated to reflect this accurately.

nit: isn't the type of self.agent Agent? why do we need ducktyping checks?

Fixes: - mypy error: _STATE_TO_STOP_REASON now typed as dict[TaskState, StopReason] instead of implicit str values (was: 'str' incompatible with Literal type) - Bug: None parts crash in convert_response_to_agent_result (artifact.parts and message.parts checked for None before iteration) - Security: error messages no longer expose raw exception details to clients - Lint: removed unused variable 'e' in outer except clause (F841) Review feedback addressed: - Structured logging: all log messages now use 'task_id=<%s> | message' format - cancel() docstring: accurately describes state-only transition + best-effort agent.cancel() call - cancel() now calls agent.cancel() if method exists (cooperative cancellation) - DRY: added _COMPLETE_STATES = _TERMINAL_STATES | _INPUT_STATES and simplified _is_complete_event to use single set membership test - Type annotation: state dict uses dict[str, str] instead of bare dict - RuntimeError catches: added comments explaining they guard TaskUpdater's terminal state enforcement Coverage improvement: - Added tests for: task-already-terminal error path, agent.cancel() call, agent.cancel() exception handling - 181 tests pass (up from 178)

agent-of-mkmeral · 2026-05-04T19:47:08Z

CI Fix Pushed + All Review Feedback Addressed

Why CI Failed

mypy type error — the _STATE_TO_STOP_REASON dict had implicit str values, but AgentResult.stop_reason expects StopReason (a Literal type):

src/strands/multiagent/a2a/_converters.py:169: error: Argument "stop_reason" to "AgentResult" 
has incompatible type "str"; expected "Literal['cancelled', 'checkpoint', ...]"  [arg-type]

Fix: Explicitly typed the dict as dict[TaskState, StopReason] and imported StopReason from types.event_loop.

What Else Was Fixed

Issue	Fix
mypy `arg-type` error	`_STATE_TO_STOP_REASON: dict[TaskState, StopReason]`
Bug: `artifact.parts` = None crash	Added `and update_event.artifact.parts` guard
Bug: `message.parts` = None crash	Added `and update_event.status.message.parts` guard
Security: raw exception in response	Generic message: `"Agent execution failed"` (details only in logs)
Logging: non-structured format	All messages now use `task_id=<%s> \| message` format
cancel(): no agent stop	Now calls `self.agent.cancel()` if method exists
DRY: duplicate state sets	Added `_COMPLETE_STATES = _TERMINAL_STATES \| _INPUT_STATES`
Type safety: bare `dict`	Changed to `dict[str, str]`
Lint: F841 unused var	Removed `as e` from outer `except Exception`
Coverage: 6 lines missing	Added 3 new tests covering edge cases → 181 tests total

All tests pass locally. CI should be green now. @mkmeral

github-actions · 2026-05-04T19:55:32Z

Assessment: Approve

Great job addressing the previous review feedback. The security fix (generic error messages), cooperative agent.cancel() call, structured logging, accurate docstrings, and _COMPLETE_STATES simplification all look good.

Remaining Minor Items

Comment accuracy: The "Derived from _STATE_TO_STOP_REASON" comment on the state sets is misleading since they're still manually defined (not programmatically derived). Consider fixing the wording or actually deriving them.
One log line: Line 289 doesn't follow the structured format (low priority since there's no task context available).

These are non-blocking. The implementation is solid and well-tested.

agent-of-mkmeral

TL;DR: PASS — All previously-found bugs are fixed. 26 additional tests pass including A2A spec conformance validation against Google ADK and a2a-sdk 0.3.26.

Adversarial Testing Report (Re-run on 451cae9)

Scope: Full A2A lifecycle state support re-tested on latest commit + spec conformance against open source
Tests written: 26 (5 regression + 21 spec conformance)
Tests passing: 26
Tests failing: 0

Previously-Found Bugs: FIXED ✅

#	Bug	Status
1	`artifact.parts` is None crash	Fixed (line 136 now checks `and update_event.artifact.parts`)
2	`status.message.parts` is None crash	Fixed (line 142 now checks `and update_event.status.message.parts`)
3	`task.artifacts[].parts` fallback also had same pattern	Fixed (line 150 now checks `and artifact.parts`)

Spec Conformance Validation

Validated against:

A2A Specification (google/A2A repo, spec 1.0)
a2a-sdk 0.3.26 (reference Python library)
Google ADK (google-adk, Agent Development Kit — the reference A2A executor implementation)

TaskState Alignment (7 tests)

✅ All 9 a2a-sdk TaskState enum values accounted for
✅ Terminal states match spec: {completed, failed, canceled, rejected}
✅ Pausing states match spec: {input_required, auth_required}
✅ Non-complete states (submitted, working, unknown) correctly ignored
✅ All lifecycle states mapped to stop_reasons
✅ Terminal → end_turn, Pausing → interrupt

Executor Behavior vs Google ADK (3 tests)

✅ Error → TaskState.failed (same pattern as Google ADK)
✅ Cancel with no task → ServerError (Strands actually does BETTER than ADK which just raises NotImplementedError)
✅ Cancel with task → TaskState.canceled (Strands implements proper cancel; ADK doesn't)

Client-Side Event Recognition (3 tests)

✅ Stream closes on terminal states (spec: "stream MUST close when task reaches terminal state")
✅ Stream pauses on input_required/auth_required
✅ Stream continues on submitted/working

Converter Output (4 tests)

✅ completed → end_turn with a2a_task_state: "completed"
✅ failed → end_turn with error content preserved
✅ input_required → interrupt with input message
✅ Direct Message response → no state metadata

TaskUpdater API Compatibility (4 tests)

✅ All required methods exist: failed, cancel, requires_input, complete, add_artifact, update_status, new_agent_message
✅ failed(), requires_input(), cancel() all accept message parameter

Previous Adversarial Tests Still Passing (5 tests)

✅ artifact.parts = None no longer crashes
✅ message.parts = None no longer crashes
✅ task.artifacts[].parts = None fallback also protected
✅ _STATE_TO_STOP_REASON properly typed (mypy fix)
✅ All 6 lifecycle states have mappings

Key Findings from Open Source Comparison

Google ADK's cancel() raises NotImplementedError('Cancellation is not supported') — Strands actually provides superior A2A spec compliance by implementing proper cancel support.
Google ADK's input_required uses LongRunningFunctions to detect when an agent needs user input and transitions to input_required — Strands maps this from the Strands interrupt mechanism, which is a clean and correct mapping.
Error handling — Both Google ADK and Strands catch exceptions and transition to TaskState.failed with error messages. Strands additionally handles the edge case where updater.failed() itself raises (task already terminal).

🤖 AI agent response. Strands Agents. Feedback welcome!

Changes: 1. InternalError now includes descriptive messages: - 'No valid content found in request message parts' - 'Request message is missing or has no parts' (was: bare InternalError() with no explanation) 2. State sets derived from _STATE_TO_STOP_REASON (single source of truth): - _TERMINAL_STATES = {s for s, r in _STATE_TO_STOP_REASON.items() if r == 'end_turn'} - _INPUT_STATES = {s for s, r in _STATE_TO_STOP_REASON.items() if r == 'interrupt'} - Removed manual TaskState enum listing (no more duplication) 3. Structured logging on cancel no-task path: - context_id=<%s> | cancel requested but no current task found 4. Coverage: executor.py now 98% (0 missing statements) - Fixed test_execute_error_when_task_already_terminal: patches TaskUpdater - Added test_cancel_raises_when_task_already_terminal: covers RuntimeError path - 182 tests pass

agent-of-mkmeral · 2026-05-04T20:28:18Z

Latest Push (`2c8539f`) — All Review Comments Addressed

What Changed

Issue	Fix
"why mysterious server error?"	`InternalError(message="No valid content found...")` — now tells client what went wrong
"+1 unify these"	State sets now actually derived from `_STATE_TO_STOP_REASON` via set comprehensions
Line 289 logging	Now uses `context_id=<%s> \| cancel requested...`
codecov/patch < threshold	executor.py now 98% coverage (0 missing statements, only branch partials)

Coverage Fix

Previous tests mocked at wrong level (event_queue). New tests properly patch TaskUpdater to raise RuntimeError:

test_execute_error_when_task_already_terminal — patches TaskUpdater.failed() ✅
test_cancel_raises_when_task_already_terminal — patches TaskUpdater.cancel() ✅

182 tests pass. @mkmeral

github-actions · 2026-05-04T20:31:24Z

Assessment: Approve

All previous review feedback has been addressed in the latest push. The implementation is clean, well-tested, and ready to merge.

Verified Fixes

✅ State sets now properly derived from _STATE_TO_STOP_REASON (single source of truth)
✅ Structured logging format used consistently (including context_id for cancel-without-task)
✅ InternalError now includes descriptive messages for setup failures
✅ cancel() calls self.agent.cancel() cooperatively + accurate docstring
✅ Coverage gaps addressed with proper TaskUpdater mocking
✅ TaskState import removed from a2a_agent.py (no longer needed)

No further comments.

agent-of-mkmeral

TL;DR: CONCERNED — Tests provide good coverage for the happy paths but have significant gaps in edge cases that could produce silent failures in production. 2 critical, 3 major, 2 minor findings.

🔍 Devil's Advocate Test Review

Scope: All tests added/modified in PR #2245 (lifecycle state support)
Tests reviewed: 26 new tests across 3 files
Findings: 2 critical, 3 major, 2 minor

Summary Verdict

The tests accurately verify the designed happy paths and state mappings. However, they fail to test the boundaries where the code's behavioral decisions are most questionable: empty interrupt lists silently completing tasks, asyncio.CancelledError leaving tasks in indeterminate states, and an inconsistent assertion in the most important state test. The test suite gives confidence that designed flows work but provides no assurance about unexpected flows.

Critical Findings

Finding 1 — Critical: asyncio.CancelledError leaves task in indeterminate state (UNTESTED)

Test: None (gap)
Source: executor.py lines 99-109 (execute() method)
Claims to verify: "Agent execution errors now gracefully transition the task to the failed state"
Actually verifies: Only RuntimeError (an Exception subclass) — test: test_execute_transitions_to_failed_on_streaming_error
Gap: asyncio.CancelledError is a BaseException, not Exception. The except Exception: block at line 107 does NOT catch it. When an asyncio task is cancelled (e.g., HTTP client timeout, server shutdown, concurrent task group cancellation), CancelledError propagates uncaught, and the A2A task is left in working state forever — it never transitions to failed or canceled.
Risk: In production with async frameworks (uvicorn, starlette), HTTP request cancellation raises CancelledError. The remote client sees the task stuck in working with no terminal state — a zombie task.
What good looks like: Either: (1) catch BaseException and transition to canceled/failed, or (2) explicitly test and document that CancelledError propagation is intentional and the framework handles cleanup.

Proof:

async def mock_stream(content_blocks, **kwargs):
    yield {"data": "partial"}
    raise asyncio.CancelledError()

# Result: CancelledError PROPAGATES. Task stays in working state.

Finding 2 — Critical: stop_reason="interrupt" with empty interrupts silently completes task (UNTESTED)

Test: None (gap)
Source: executor.py line 153: if result.stop_reason == "interrupt" and result.interrupts:
Claims to verify: "When the Strands Agent returns with stop_reason="interrupt", the task transitions to input_required"
Actually verifies: Only the case where result.interrupts has actual Interrupt objects
Gap: If result.interrupts = [] (empty list, which is falsy in Python), the condition evaluates to False and the task transitions to completed instead of input_required. The agent explicitly signaled it needs input, but the A2A task says "completed". No test validates this behavior or proves it's intentional.
Risk: A Strands agent with stop_reason="interrupt" but an empty interrupts list (possible during concurrent modifications or edge cases in interrupt handling) will silently succeed instead of pausing. The calling agent gets end_turn instead of interrupt, losing the signal entirely.
What good looks like: A test that explicitly asserts the behavior for interrupts=[] — either validating it should be treated as completion (with a docstring explaining why) or revealing it as a bug that should transition to input_required.

Proof:

result.stop_reason = "interrupt"
result.interrupts = []  # Falsy!
# Result: Task transitions to `completed`, NOT `input_required`

Major Findings

Finding 3 — Major: test_convert_response_completed_state_maps_to_end_turn missing state assertion

Test: test_converters.py::test_convert_response_completed_state_maps_to_end_turn
Claims to verify: "completed state maps to end_turn stop_reason" (name) + state metadata
Actually verifies: Only assert result.stop_reason == "end_turn" — does NOT assert result.state
Gap: Every other state test (failed, canceled, rejected, input_required, auth_required) asserts BOTH result.stop_reason AND result.state.get("a2a_task_state"). The completed test — arguably the MOST important one since it's the happy path — is the only one missing the state assertion.
Risk: If a regression removes the a2a_task_state: "completed" from the result state, no test catches it. Downstream consumers relying on result.state["a2a_task_state"] would break silently.
What good looks like: Add assert result.state.get("a2a_task_state") == "completed" (one line).

Finding 4 — Major: No test for TaskState.unknown mapping

Test: None (gap)
Source: _converters.py line 122: _STATE_TO_STOP_REASON.get(task_state, "end_turn") if task_state else "end_turn"
Claims to verify: PR description says "proper handling of all A2A task states"
Actually verifies: 6 of 9 TaskState values are tested. unknown, submitted, working are not tested in converters.
Gap: The a2a-sdk has TaskState.unknown. The code handles it via the .get() default (end_turn). But there's no explicit test proving that unknown → end_turn is intentional. If the a2a-sdk adds new states in the future and the default is wrong, there's no guardrail.
Risk: Low-probability but high-impact: if a real A2A server returns unknown state, the client treats it as a successful completion with end_turn. Is that correct? Without a test, nobody knows if this was a deliberate decision or an oversight.
What good looks like: One test: assert _STATE_TO_STOP_REASON.get(TaskState.unknown, "end_turn") == "end_turn" with a docstring explaining the design choice.

Finding 5 — Major: No end-to-end wire format validation between server and client

Test: None (gap)
Source: executor.py (server) + a2a_agent.py (client)
Claims to verify: "A2A task lifecycle" works end-to-end
Actually verifies: Server tests verify events are enqueued correctly. Client tests verify events are recognized correctly. But there's NO test proving the server's output format is what the client expects.
Gap: The server-side test for input_required creates a TaskStatusUpdateEvent via TaskUpdater.requires_input(). The client-side test creates a TaskStatusUpdateEvent via MagicMock(spec=TaskStatusUpdateEvent). These could have different structures (e.g., the final flag, message format, context_id presence). No integration test validates compatibility.
Risk: If TaskUpdater.requires_input() produces a slightly different event shape than what _is_complete_event() checks for (e.g., the event has status=None in some edge case), the client would loop forever waiting for completion.
What good looks like: One integration test that creates a real TaskUpdater, calls .requires_input(), captures the enqueued event, and passes it to _is_complete_event().

Minor Findings

Finding 6 — Minor: test_execute_error_when_task_already_terminal doesn't verify event_queue state

Test: test_executor.py::test_execute_error_when_task_already_terminal
Claims to verify: "error handled gracefully when task already in terminal state"
Actually verifies: mock_updater.failed.assert_called_once() (that the attempt was made)
Gap: Doesn't verify that event_queue received no spurious events. If the code inadvertently enqueues a partial event before the error, this test wouldn't catch it.
What good looks like: Add assert mock_event_queue.enqueue_event.call_count == 1 (only the initial task creation event).

Finding 7 — Minor: State tests could be parameterized (DRY concern)

Tests: All 7 test_is_complete_event_*_state_* tests in test_a2a_agent.py
Issue: Each test is an identical 10-line copy with only the state value changed. Using @pytest.mark.parametrize would reduce 70 lines to ~15 lines and make it trivial to add new states when the a2a-sdk evolves.
Risk: None (correctness is fine). But when a2a-sdk 0.4.0 adds new states, it's easy to forget to add the corresponding test.
What good looks like: @pytest.mark.parametrize("state,expected", [(TaskState.failed, True), (TaskState.working, False), ...])

Missing Test Coverage (Priority Order)

Priority	Scenario	Why It Matters
Critical	`asyncio.CancelledError` during stream	Leaves zombie tasks in production
Critical	`stop_reason="interrupt"` + `interrupts=[]`	Silent data loss of interrupt signal
Major	`TaskState.unknown` converter behavior	Future-proofing against SDK changes
Major	Server→Client wire format round-trip	Integration gap between components
Minor	`_extract_task_state` with `TaskArtifactUpdateEvent` input	Returns None (untested path)

What's Actually Good

The tests that DO exist are well-structured: they use real A2A types (not just strings), verify both the state transition AND the message content, test the error-already-terminal edge case, and verify cooperative cancellation (agent.cancel() called + exception swallowed). The multi-interrupt test is particularly good — it validates that all interrupt names and reasons appear in the status message.

Recommendations

Add asyncio.CancelledError handling in execute() — either except BaseException with proper cancel transition, or document the deliberate propagation. Either way, add a test.
Decide on interrupts=[] behavior — add a test that either asserts completion (documenting the intentional fallthrough) or fix it to treat any stop_reason="interrupt" as input_required.
**Add assert result.state.get("a2a_task_state") == "completed"** to the completed converter test (1-line fix).
Add one TaskState.unknown test to document the default behavior.
Consider one integration test that runs the executor and feeds its output to the client's _is_complete_event.

🤖 AI agent response. Strands Agents. Feedback welcome!

…bugs Devil's Advocate Review Findings Addressed: Critical (2): 1. asyncio.CancelledError now transitions task to 'canceled' state before re-raising. Previously, CancelledError (BaseException, not Exception) would propagate uncaught, leaving the A2A task stuck in 'working' forever (zombie). - Added explicit 'except asyncio.CancelledError' handler in execute() - Transitions to canceled, then re-raises for framework cleanup - Handles edge case where task is already terminal (RuntimeError) 2. stop_reason='interrupt' with empty/None interrupts list no longer silently completes the task. The stop_reason is now the authoritative signal — if the agent says 'interrupt', we transition to input_required regardless of whether the interrupts list is populated. - Removed 'and result.interrupts' from the condition - Added fallback message: 'Agent requires additional input to continue' Major (3): 3. test_convert_response_completed_state now asserts result.state metadata (was the only lifecycle test missing this assertion) 4. Added test for TaskState.unknown → end_turn default behavior 5. Added test_state_to_stop_reason_covers_all_lifecycle_states (guards against future a2a-sdk additions we miss) Minor (2): 6. Added test_extract_task_state_from_artifact_update_returns_none 7. Added parametrized test covering ALL 9 TaskState values for _is_complete_event (replaces verbose individual tests) Code fixes: - cancel(): Removed hasattr/callable duck-typing (nit from review), now uses try/except (AttributeError, NotImplementedError) directly - Added 'import asyncio' to executor.py Tests: 201 pass (was 182)

agent-of-mkmeral · 2026-05-04T20:55:18Z

All Devil's Advocate Findings Addressed

Commit: a66e210
Tests: 201 pass (was 182)

Critical Fixes (Code + Tests)

#	Finding	Fix
1	`asyncio.CancelledError` left tasks as zombies in `working` state	Added `except asyncio.CancelledError` handler → transitions to `canceled` before re-raising
2	`stop_reason='interrupt'` with empty `interrupts=[]` silently completed task	Removed `and result.interrupts` condition — `stop_reason` is now the authoritative signal

Major Fixes (Tests)

#	Finding	Fix
3	Completed state test missing `result.state` assertion	Added `assert result.state.get('a2a_task_state') == 'completed'`
4	No test for `TaskState.unknown` mapping	Added test documenting unknown → end_turn default (intentional design decision)
5	No guard against future a2a-sdk state additions	Added `test_state_to_stop_reason_covers_all_lifecycle_states`

Minor Fixes

#	Finding	Fix
6	`_extract_task_state` with artifact event untested	Added test (returns None)
7	State tests verbose/hard to extend	Added `@pytest.mark.parametrize` covering all 9 TaskState values

Also Fixed (from mkmeral's nit)

Removed duck-typing hasattr/callable check for agent.cancel() — now uses try/except directly

🤖 AI agent response. Strands Agents. Feedback welcome!

github-actions · 2026-05-04T20:59:35Z

+            raise ServerError(error=UnsupportedOperationError()) from None
+
+        # Attempt to cooperatively cancel the agent's execution (best-effort).
+        # Agent.cancel() may not exist on all implementations, so we guard with hasattr.


Issue: The comment says "we guard with hasattr" but the implementation actually uses try/except on AttributeError (which is the right approach per mkmeral's feedback). The comment is a leftover from the previous iteration.

Suggestion: Update the comment to match the implementation:

# Attempt to cooperatively cancel the agent's execution (best-effort). # Catches AttributeError/NotImplementedError if agent doesn't support cancel().

github-actions · 2026-05-04T20:59:36Z

Assessment: Approve

Excellent iteration. The two critical findings (asyncio.CancelledError zombie prevention and empty-interrupts edge case) are real bugs that have been properly fixed and tested. The code is now robust, well-documented, and comprehensive.

New Changes Verified

✅ asyncio.CancelledError handling — transitions to canceled before re-raising (prevents zombie tasks)
✅ stop_reason="interrupt" is now the authoritative signal (empty/None interrupts still trigger input_required)
✅ cancel() uses try/except AttributeError instead of hasattr (per mkmeral's feedback)
✅ Parametrized tests covering all 9 TaskState values
✅ Guard test (test_state_to_stop_reason_covers_all_lifecycle_states) prevents future state drift
✅ TaskState.unknown default behavior documented and tested

One trivial nit: line 318 comment says "we guard with hasattr" but the implementation uses try/except. Non-blocking.

agent-of-mkmeral added 2 commits May 4, 2026 19:16

github-actions Bot added the size/l label May 4, 2026

agent-of-mkmeral temporarily deployed to manual-approval May 4, 2026 19:22 — with GitHub Actions Inactive

agent-of-mkmeral requested a deployment to manual-approval May 4, 2026 19:22 — with GitHub Actions Waiting

github-actions Bot added the strands-running label May 4, 2026

mkmeral self-assigned this May 4, 2026

agent-of-mkmeral commented May 4, 2026

View reviewed changes

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/executor.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/executor.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/executor.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/executor.py

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/agent/a2a_agent.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/agent/a2a_agent.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/_converters.py Outdated

github-actions Bot removed the strands-running label May 4, 2026

mkmeral reviewed May 4, 2026

View reviewed changes

github-actions Bot removed the size/l label May 4, 2026

agent-of-mkmeral temporarily deployed to manual-approval May 4, 2026 19:45 — with GitHub Actions Inactive

agent-of-mkmeral requested a deployment to manual-approval May 4, 2026 19:45 — with GitHub Actions Waiting

github-actions Bot added the size/l label May 4, 2026

github-actions Bot added the strands-running label May 4, 2026

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/agent/a2a_agent.py Outdated

github-actions Bot reviewed May 4, 2026

View reviewed changes

Comment thread src/strands/multiagent/a2a/executor.py Outdated

github-actions Bot removed the strands-running label May 4, 2026

agent-of-mkmeral commented May 4, 2026

View reviewed changes

github-actions Bot added size/l and removed size/l labels May 4, 2026

agent-of-mkmeral requested a deployment to manual-approval May 4, 2026 20:27 — with GitHub Actions Waiting

agent-of-mkmeral temporarily deployed to manual-approval May 4, 2026 20:27 — with GitHub Actions Inactive

github-actions Bot added the strands-running label May 4, 2026

github-actions Bot removed the strands-running label May 4, 2026

agent-of-mkmeral commented May 4, 2026

View reviewed changes

github-actions Bot added size/xl and removed size/l labels May 4, 2026

agent-of-mkmeral requested a deployment to manual-approval May 4, 2026 20:55 — with GitHub Actions Waiting

agent-of-mkmeral temporarily deployed to manual-approval May 4, 2026 20:55 — with GitHub Actions Inactive

github-actions Bot added the strands-running label May 4, 2026

github-actions Bot reviewed May 4, 2026

View reviewed changes

github-actions Bot removed the strands-running label May 4, 2026

Conversation

agent-of-mkmeral commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Actually Changes (Before → After)

1. Agent Errors No Longer Crash the Task

2. Cancellation Now Works

3. Strands Interrupts → A2A input_required

4. Client-Side (A2AAgent) Handles All States

Impact on A2A Tasks

Task State Machine (what this PR enables)

Who Is Affected

Backwards Compatibility

Changes

Server-side (executor.py)

Client-side (a2a_agent.py)

Converters (_converters.py)

Testing

Quality

Uh oh!

codecov Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

agent-of-mkmeral left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 4, 2026

Uh oh!

Uh oh!

mkmeral May 4, 2026

Choose a reason for hiding this comment

Uh oh!

agent-of-mkmeral May 4, 2026

Choose a reason for hiding this comment

Uh oh!

mkmeral May 4, 2026

Choose a reason for hiding this comment

Uh oh!

agent-of-mkmeral commented May 4, 2026

CI Fix Pushed + All Review Feedback Addressed

Why CI Failed

What Else Was Fixed

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 4, 2026

Uh oh!

agent-of-mkmeral left a comment

Choose a reason for hiding this comment

Previously-Found Bugs: FIXED ✅

Spec Conformance Validation

Key Findings from Open Source Comparison

Uh oh!

agent-of-mkmeral commented May 4, 2026

Latest Push (2c8539f) — All Review Comments Addressed

What Changed

Coverage Fix

Uh oh!

github-actions Bot commented May 4, 2026

Uh oh!

agent-of-mkmeral left a comment

Choose a reason for hiding this comment

Summary Verdict

Critical Findings

Major Findings

Minor Findings

Missing Test Coverage (Priority Order)

What's Actually Good

Recommendations

Uh oh!

agent-of-mkmeral commented May 4, 2026

All Devil's Advocate Findings Addressed

agent-of-mkmeral commented May 4, 2026 •

edited

Loading

3. Strands Interrupts → A2A `input_required`

4. Client-Side (`A2AAgent`) Handles All States

Server-side (`executor.py`)

Client-side (`a2a_agent.py`)

Converters (`_converters.py`)

codecov Bot commented May 4, 2026 •

edited

Loading

Latest Push (`2c8539f`) — All Review Comments Addressed