Python: FoundryChatClient crashes with AsyncStreamWrapper has no attribute 'headers' when azure-ai-projects experimental tracing is enabled #6028

@EmilienMottet

Describe the bug

After upgrading from agent-framework-openai==1.5.0 to 1.6.0, every streaming LLM call through FoundryChatClient crashes with:

agent_framework.exceptions.ChatClientException:
  ("<class 'agent_framework_foundry._chat_client.FoundryChatClient'>
    service failed to complete the prompt: 'AsyncStreamWrapper' object has no attribute 'headers'",
   AttributeError("'AsyncStreamWrapper' object has no attribute 'headers'"))

This breaks every workflow and every chat reply for any application running agent-framework>=1.6.0 together with azure-ai-projects experimental GenAI tracing enabled (AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true), which is a documented opt-in for end-to-end OTel spans.

Root cause

PR #5910 (commit 1b6f7d80) introduced four served_model = self._extract_served_model(raw_*_response.headers) reads inside _stream()/_get_response() to surface the x-ms-served-model header:

  • python/packages/openai/agent_framework_openai/_chat_client.py:639: raw_stream_response.headers
  • python/packages/openai/agent_framework_openai/_chat_client.py:680: raw_create_response.headers
  • python/packages/openai/agent_framework_openai/_chat_client.py:709: raw_response.headers (background poll)
  • python/packages/openai/agent_framework_openai/_chat_client.py:731: raw_response.headers (non-streaming)

When azure-ai-projects>=2.1.0 experimental tracing is active, azure/ai/projects/telemetry/_responses_instrumentor._async_wrapped_responses_stream wraps the raw streaming response into an AsyncStreamWrapper (defined inline at _responses_instrumentor.py:2929). That wrapper exposes .response / .stream_async_iter but not .headers, so raw_stream_response.headers raises AttributeError.

_extract_served_model (line 739) already tolerates headers=None (it returns None), so the cleanest fix is to read headers defensively with getattr(..., "headers", None).
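The failure mode can be reproduced without Azure at all. The sketch below uses two hypothetical stand-in classes (RawStreamResponse and TracingStreamWrapper are illustrative names, not the real openai or azure-ai-projects types) and assumes only what the analysis above states: the wrapper keeps the wrapped object on .response but does not forward .headers.

```python
class RawStreamResponse:
    """Hypothetical stand-in for the raw streaming response, which carries headers."""

    def __init__(self, headers):
        self.headers = headers


class TracingStreamWrapper:
    """Hypothetical stand-in for a tracing wrapper that does not proxy .headers."""

    def __init__(self, response):
        # The wrapped object is kept, but none of its attributes are forwarded.
        self.response = response


def read_headers(response):
    # Mirrors the direct attribute read that the four call sites perform.
    return response.headers


raw = RawStreamResponse({"x-ms-served-model": "example-model"})
wrapped = TracingStreamWrapper(raw)

# The unwrapped response works as expected.
served_model = read_headers(raw)["x-ms-served-model"]

# The wrapped response fails with the same kind of AttributeError as in the report.
error_message = None
try:
    read_headers(wrapped)
except AttributeError as exc:
    error_message = str(exc)
```

The real crash surfaces as a ChatClientException because the client wraps the underlying AttributeError; this sketch only shows the inner error.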

To reproduce

  1. pip install "agent-framework>=1.6.0" "azure-ai-projects>=2.1.0" (quote the specifiers so the shell does not treat > as a redirect)
  2. Set AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true (Microsoft-documented opt-in)
  3. Configure Azure Monitor OpenTelemetry so the experimental instrumentor activates
  4. Instantiate a FoundryChatClient against an Azure OpenAI Responses deployment (any GPT model; reproduced with gpt-5.1-dzs and gpt-5.5-dzs)
  5. Call get_streaming_response("hello")

Expected: streamed response.
Actual: immediate ChatClientException wrapping the AttributeError.

Environment

  • agent-framework-core==1.6.0, agent-framework-foundry==1.6.0, agent-framework-openai==1.6.0
  • azure-ai-projects==2.1.0
  • Python 3.12 on Linux
  • Reproduced both via FoundryChatClient.get_streaming_response() and via the agent loop (workflow.run(..., stream=True) / agent.run(..., stream=True))

Proposed fix

Use getattr(raw_*_response, "headers", None) for all four call sites in agent_framework_openai/_chat_client.py. _extract_served_model already handles None, so the served-model feature degrades gracefully (no header surfaced) instead of crashing when the response is wrapped by an instrumentor that doesn't proxy .headers. PR follows.
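A minimal sketch of the defensive read, for illustration only. extract_served_model below is a hypothetical stand-in for _extract_served_model that assumes only the behavior stated above (it returns None when headers is None); served_model_from, PlainResponse, and WrappedResponse are likewise illustrative names and not part of either library.

```python
from typing import Any, Mapping, Optional


def extract_served_model(headers: Optional[Mapping[str, str]]) -> Optional[str]:
    """Hypothetical stand-in: return the served model, or None if headers are absent."""
    if headers is None:
        return None
    # Real HTTP header mappings are usually case-insensitive; a plain dict is not.
    return headers.get("x-ms-served-model")


def served_model_from(response: Any) -> Optional[str]:
    # Defensive read: a wrapper without .headers yields None instead of raising.
    return extract_served_model(getattr(response, "headers", None))


class PlainResponse:
    """Hypothetical response object that exposes headers directly."""

    def __init__(self, headers: Mapping[str, str]) -> None:
        self.headers = headers


class WrappedResponse:
    """Hypothetical instrumentor wrapper that does not proxy .headers."""

    def __init__(self, response: PlainResponse) -> None:
        self.response = response


plain = PlainResponse({"x-ms-served-model": "example-model"})
wrapped = WrappedResponse(plain)

from_plain = served_model_from(plain)
from_wrapped = served_model_from(wrapped)
```

With this shape, the wrapped case loses the served-model value but the call completes, which matches the graceful degradation described above.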

Workaround for users

Set AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=false until the fix lands.
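If changing the deployment environment is not convenient, the flag can also be set from the application's entry point. This is a sketch under one assumption that has not been verified here: that the variable is read when tracing is set up, so it has to be set before any telemetry configuration runs.

```python
import os

# Assumption: the flag is read during tracing setup, so set it first,
# before configuring Azure Monitor / OpenTelemetry or creating any client.
os.environ["AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING"] = "false"

# Sanity check that the process now sees the flag as disabled.
tracing_enabled = (
    os.environ.get("AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING", "").lower() == "true"
)
```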

Metadata

Labels

agents (Issues related to single agents), bug (Something isn't working), python, reproduced
