Skip to content

feat(openai-agents): GenAI semconv compliance#3837

Open
max-deygin-traceloop wants to merge 8 commits intomainfrom
max/tlp-1928-openai-agents-insturmentation
Open

feat(openai-agents): GenAI semconv compliance#3837
max-deygin-traceloop wants to merge 8 commits intomainfrom
max/tlp-1928-openai-agents-insturmentation

Conversation

@max-deygin-traceloop
Copy link
Contributor

@max-deygin-traceloop max-deygin-traceloop commented Mar 22, 2026

Migrate openai-agents instrumentation to OTel GenAI semconv

Aligns the OpenAI Agents SDK instrumentation with the upstream OpenTelemetry GenAI semantic conventions.

Changes:

  • Replace all SpanAttributes.LLM_* constants with upstream GenAIAttributes.GEN_AI_* equivalents
  • Migrate flat gen_ai.prompt.{i}.* / gen_ai.completion.{i}.* attributes to gen_ai.input.messages / gen_ai.output.messages JSON arrays
  • Migrate flat gen_ai.tool.definitions.{i}.* attributes to gen_ai.tool.definitions JSON array
  • Fix 5 duplicate gen_ai.operation.name dict keys in _hooks.py
  • Fix duplicate gen_ai.system key in _realtime_wrappers.py
  • Update all tests to use new attribute names and JSON array format

Note: This is a recreation of #3823, which was auto-closed when its target branch was merged into main. No code changes were made.

Summary by CodeRabbit

  • Refactor

    • Telemetry now records conversation inputs/outputs as JSON arrays (gen_ai.input.messages, gen_ai.output.messages) instead of many per-message attributes.
    • Tool definitions and usage totals are consolidated into single JSON attributes and updated semantic names (gen_ai.tool_definitions, gen_ai.usage.*, gen_ai.operation.name).
  • Tests

    • Tests updated to validate the new JSON-encoded attributes and revised semantic convention names.
  • Chores

    • Bumped semantic conventions dependency to v0.5.0+.

max-deygin-traceloop and others added 7 commits March 22, 2026 10:21
…LM_ → GEN_AI_

- Replace SpanAttributes.LLM_REQUEST_TYPE with GenAIAttributes.GEN_AI_OPERATION_NAME
- Replace SpanAttributes.LLM_REQUEST_FUNCTIONS with GenAIAttributes.GEN_AI_TOOL_DEFINITIONS
- Replace SpanAttributes.LLM_SYSTEM with GenAIAttributes.GEN_AI_SYSTEM
- Replace SpanAttributes.LLM_USAGE_TOTAL_TOKENS with SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS
- Add test_semconv_compliance.py and [tool.uv.sources] for local semconv_ai

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l definitions to JSON array

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…en test constants, remove duplicate GEN_AI_SYSTEM key

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ssage format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ve orphaned completion.tool.* attrs

- _realtime_wrappers.py: replace flat gen_ai.prompt.*/gen_ai.completion.* in
  create_llm_span() with GEN_AI_INPUT_MESSAGES/GEN_AI_OUTPUT_MESSAGES JSON arrays
- _hooks.py: remove gen_ai.completion.tool.{name,type,strict_json_schema} sub-attributes
  from FunctionSpanData handler (tool name already captured via GEN_AI_TOOL_NAME)
- tests: update test_realtime_session.py assertions to parse JSON array attributes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, remove AgentSpanData misfeature

- _hooks.py: remove misplaced catch-all 'elif span_data:' that was shadowing
  SpeechSpanData, TranscriptionSpanData, SpeechGroupSpanData, and AgentSpanData branches
- _hooks.py: remove AgentSpanData handler that incorrectly propagated model settings
  to agent spans (test spec: agent spans must NOT carry gen_ai.request.* params)
- _hooks.py: replace hardcoded "openai.agent.model.frequency_penalty" with
  GenAIAttributes.GEN_AI_REQUEST_FREQUENCY_PENALTY constant
- tests: fix dead "llm.usage.*" prefix check, fix vestigial "gen_ai.prompt" scan,
  fix hardcoded frequency_penalty string, fix long line in test_realtime_session.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 22, 2026

📝 Walkthrough

Walkthrough

Prompt and response attributes were consolidated: per-message indexed attributes (e.g., gen_ai.prompt.0.*, gen_ai.completion.0.*) are replaced by JSON-encoded message arrays stored in gen_ai.input.messages and gen_ai.output.messages. Span attribute names were updated to newer GenAI semconv fields; tests and dependency constraints adjusted accordingly.

Changes

Cohort / File(s) Summary
Core Implementation
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py, packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_realtime_wrappers.py
Reworked prompt/response extraction to emit single JSON-encoded arrays (gen_ai.input.messages, gen_ai.output.messages) instead of per-message indexed attributes; removed some tool metadata attributes and legacy fallbacks; switched request-type/usage attribute names to GenAI semconv; added json usage for serialization.
Dependencies
packages/opentelemetry-instrumentation-openai-agents/pyproject.toml
Bumped opentelemetry-semantic-conventions-ai constraint from >=0.4.13,<0.5.0 to >=0.5.0,<0.6.0.
Tests
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py, .../tests/test_realtime.py, .../tests/test_realtime_session.py, .../tests/test_recipe_agents_hierarchy.py
Updated tests to parse and assert JSON-encoded gen_ai.input.messages / gen_ai.output.messages instead of indexed prompt/completion attributes; removed assertions for SpanAttributes.LLM_REQUEST_TYPE; added json imports and tightened usage attribute checks to gen_ai.usage.*.
New Compliance Test
packages/opentelemetry-instrumentation-openai-agents/tests/test_semconv_compliance.py
Added test module that imports semantic-convention test symbols from opentelemetry.semconv_ai._testing to validate installed semconv constants.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hop, hop, messages bundled tight,

JSON arrays gleam in telemetry light,
No more indices scattered wide,
One tidy stream where prompts reside,
A rabbit cheers the cleaner stride! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title accurately describes the main change: migrating OpenAI Agents instrumentation to comply with upstream OpenTelemetry GenAI semantic conventions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch max/tlp-1928-openai-agents-insturmentation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai-agents/tests/test_semconv_compliance.py (1)

1-8: Consider extending _testing.py to cover new message/tool attributes.

The shared compliance tests in opentelemetry.semconv_ai._testing (per the context snippet at packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/_testing.py:1-56) currently validate renamed GEN_AI_* constants but do not include tests for:

  • GenAIAttributes.GEN_AI_INPUT_MESSAGES
  • GenAIAttributes.GEN_AI_OUTPUT_MESSAGES
  • GenAIAttributes.GEN_AI_TOOL_DEFINITIONS

These constants are imported from the upstream opentelemetry.semconv._incubating.attributes.gen_ai_attributes module and used heavily throughout _hooks.py and _realtime_wrappers.py. While the import will fail if the constants don't exist, explicit compliance tests would provide better validation and documentation.

Would you like me to draft compliance tests for these new upstream constants in _testing.py?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-openai-agents/tests/test_semconv_compliance.py`
around lines 1 - 8, Add explicit compliance assertions in the shared test module
_testing.py to verify the presence and expected values of the three new GenAI
constants: GenAIAttributes.GEN_AI_INPUT_MESSAGES,
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES, and
GenAIAttributes.GEN_AI_TOOL_DEFINITIONS; import these from
opentelemetry.semconv._incubating.attributes.gen_ai_attributes (same source used
by _hooks.py and _realtime_wrappers.py) and add simple equality/assertion checks
that the exported constants in opentelemetry.semconv_ai match the upstream
constants so the tests will fail if those names or values are missing or
renamed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/test_semconv_compliance.py`:
- Around line 1-8: Add explicit compliance assertions in the shared test module
_testing.py to verify the presence and expected values of the three new GenAI
constants: GenAIAttributes.GEN_AI_INPUT_MESSAGES,
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES, and
GenAIAttributes.GEN_AI_TOOL_DEFINITIONS; import these from
opentelemetry.semconv._incubating.attributes.gen_ai_attributes (same source used
by _hooks.py and _realtime_wrappers.py) and add simple equality/assertion checks
that the exported constants in opentelemetry.semconv_ai match the upstream
constants so the tests will fail if those names or values are missing or
renamed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 55f0bea4-7b6a-434a-89a3-58e2c26ee263

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2418b and af07493.

⛔ Files ignored due to path filters (1)
  • packages/opentelemetry-instrumentation-openai-agents/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_realtime_wrappers.py
  • packages/opentelemetry-instrumentation-openai-agents/pyproject.toml
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_realtime.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_realtime_session.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_recipe_agents_hierarchy.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_semconv_compliance.py
💤 Files with no reviewable changes (1)
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_realtime.py

SpanAttributes.LLM_REQUEST_TYPE: "response",
GenAIAttributes.GEN_AI_SYSTEM: "openai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "response",
GenAIAttributes.GEN_AI_SYSTEM: "openai",
Copy link
Member

@doronkopit5 doronkopit5 Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recalling @avivhalfon comments, shouldn't this be provider?

Copy link
Member

@doronkopit5 doronkopit5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you succeed testing sending traces with this instrumentation using a sample-app?

No longer needed since 0.5.0 is published on PyPI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@doronkopit5 doronkopit5 changed the title feat(openai-agents) GenAI semconv compliance feat(openai-agents): GenAI semconv compliance Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants