feat(open-ai): instrumentation to support OTel GenAI Semantic Conventions 0.5.0 #3844
max-deygin-traceloop wants to merge 15 commits into main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough

Switches many span semantic keys from LLM to GenAI conventions, adds Config.use_messages_attributes and an OpenAIInstrumentor parameter to optionally serialize inputs/outputs as structured gen_ai.input.messages/gen_ai.output.messages, updates wrappers/models/tests accordingly, and bumps semantic-conventions dependency ranges.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
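The structured payload the walkthrough describes can be sketched in plain Python. This is an illustrative serializer, not the package's actual helper: it shows the gen_ai.input.messages shape implied by the review comments below, where a message with content=None yields an empty parts list.

```python
import json


def serialize_input_messages(messages):
    """Sketch of the gen_ai.input.messages shape described in this review:
    one object per message, each with a role and a list of text parts.
    A message with content=None yields an empty parts list."""
    out = []
    for message in messages:
        content = message.get("content")
        parts = [] if content is None else [{"content": content, "type": "text"}]
        out.append({"role": message.get("role", "user"), "parts": parts})
    return json.dumps(out)
```

A span exporter would then see this JSON string as the value of the gen_ai.input.messages attribute.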
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py (1)
626-630: ⚠️ Potential issue | 🟡 Minor: Confusing variable names in token usage assertion.

The variable names don't match the attributes being retrieved:

- completion_tokens ← GEN_AI_USAGE_TOTAL_TOKENS
- prompt_tokens ← GEN_AI_USAGE_INPUT_TOKENS
- total_tokens ← GEN_AI_USAGE_OUTPUT_TOKENS

The assertion completion_tokens + prompt_tokens == total_tokens effectively checks total + input == output, which is mathematically incorrect. This appears to be a pre-existing issue, but it is worth fixing for clarity.

♻️ Suggested fix
```diff
-    completion_tokens = open_ai_span.attributes.get(SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS)
-    prompt_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS)
-    total_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS)
-    if completion_tokens and prompt_tokens and total_tokens:
-        assert completion_tokens + prompt_tokens == total_tokens
+    input_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS)
+    output_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS)
+    total_tokens = open_ai_span.attributes.get(SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS)
+    if input_tokens and output_tokens and total_tokens:
+        assert input_tokens + output_tokens == total_tokens
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py` around lines 626 - 630, The three variables are misnamed and the assertion is checking the wrong relationship: change the variable names so they match the attributes (e.g., total_tokens = open_ai_span.attributes.get(SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS), input_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS), output_tokens = open_ai_span.attributes.get(GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS)) and update the assertion to assert input_tokens + output_tokens == total_tokens (keeping the existing guard that all three are truthy).
🧹 Nitpick comments (4)
packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py (1)
94-109: Also prove the legacy prompt/completion attributes stay off.

This test only checks that gen_ai.input.messages and gen_ai.output.messages exist. It would still pass if the span emitted both the new payload and the old gen_ai.prompt.*/gen_ai.completion.* keys, which is the regression this flag is most likely to introduce.

Proposed assertion additions
```diff
     assert output_messages[0]["finish_reason"] == "stop"
     assert len(output_messages[0]["parts"]) >= 1
     assert output_messages[0]["parts"][0]["type"] == "text"
     assert "content" in output_messages[0]["parts"][0]
+    assert f"{GenAIAttributes.GEN_AI_PROMPT}.0.content" not in open_ai_span.attributes
+    assert f"{GenAIAttributes.GEN_AI_COMPLETION}.0.content" not in open_ai_span.attributes
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py` around lines 94 - 109, Add assertions to ensure legacy prompt/completion attributes are not present on the span: check the span attributes (open_ai_span.attributes) do not contain any keys starting with "gen_ai.prompt." or "gen_ai.completion." (for example by asserting not any(k.startswith("gen_ai.prompt.") or k.startswith("gen_ai.completion.") for k in open_ai_span.attributes.keys())). Keep these checks near the existing input/output message validations so the test fails if legacy fields are emitted alongside the new gen_ai.input.messages/gen_ai.output.messages attributes.

packages/opentelemetry-instrumentation-openai/tests/conftest.py (1)
191-200: Restore the prior config value in teardown.

This fixture mutates a process-global singleton and always tears it back down to False. Capturing the previous value makes it compose safely with any future override and avoids silent state bleed if the default changes.

Proposed fix
```diff
 @pytest.fixture(scope="function")
 def instrument_with_messages_attributes(
     instrument_legacy, reader, tracer_provider, logger_provider, meter_provider
 ):
     instrumentor = instrument_legacy
+    previous = Config.use_messages_attributes
     Config.use_messages_attributes = True
     yield instrumentor
-    Config.use_messages_attributes = False
+    Config.use_messages_attributes = previous
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/conftest.py` around lines 191 - 200, The fixture instrument_with_messages_attributes currently sets Config.use_messages_attributes = True and always resets it to False; instead capture the previous value before mutating (prev = Config.use_messages_attributes) and after yield restore it (Config.use_messages_attributes = prev) so the fixture composes safely with other tests and preserves prior global state; locate this logic in the instrument_with_messages_attributes fixture and replace the unconditional teardown with restoring the saved prev value.

packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py (2)
103-117: Consider strengthening assertions for None content tests.

The tests verify that parts is not None, but it would be more explicit to verify the expected behavior. Based on the implementation in _set_input_messages, when content=None, the parts will be an empty list [].

💡 Optional: More explicit assertions
```diff
 def test_user_with_none_content(self, mock_span):
     """User message with None content should not crash or produce null parts."""
     _set_input_messages(mock_span, [{"role": "user", "content": None}])
     result = _get_input_messages(mock_span)
     assert len(result) == 1
-    assert result[0]["parts"] is not None
+    assert result[0]["parts"] == []

 def test_system_with_none_content(self, mock_span):
     """System message with None content should not crash or produce null parts."""
     _set_input_messages(mock_span, [{"role": "system", "content": None}])
     result = _get_input_messages(mock_span)
     assert len(result) == 1
-    assert result[0]["parts"] is not None
+    assert result[0]["parts"] == []
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py` around lines 103 - 117, Change the two tests test_user_with_none_content and test_system_with_none_content to assert the explicit expected value for parts (an empty list) rather than only checking that parts is not None; locate the assertions after calling _set_input_messages and _get_input_messages and replace the "parts is not None" checks with an equality check against [] to reflect the implementation of _set_input_messages.
211-217: Test assertion is incomplete.

The test verifies the function doesn't crash, but doesn't verify the expected output. Based on the implementation, when the only choice has message=None, the function will set GEN_AI_OUTPUT_MESSAGES to an empty JSON array "[]".

💡 Optional: Add explicit assertion
```diff
 def test_none_message_in_choice(self, mock_span):
     """Choice with message=None should not crash."""
     choices = [
         {"index": 0, "message": None, "finish_reason": "content_filter"},
     ]
     _set_output_messages(mock_span, choices)
-    # Should not crash; attribute may or may not be set
+    # Should not crash; attribute should be set to empty array
+    result = _get_output_messages(mock_span)
+    assert result == []
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py` around lines 211 - 217, The test should assert the expected attribute value when the only choice has message=None: after calling _set_output_messages(mock_span, choices) verify that mock_span.set_attribute was called with the GEN_AI_OUTPUT_MESSAGES key and a value representing an empty JSON array (i.e., "[]"); update the test_none_message_in_choice test to check mock_span.set_attribute(GEN_AI_OUTPUT_MESSAGES, "[]") (or equivalent call recorded on mock_span) so the behavior is explicitly validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py`:
- Around line 135-141: Don't serialize raw caller metadata into spans: remove
direct calls that stringify kwargs.get("user"), kwargs.get("headers"), and
kwargs.get("extra_headers") and instead validate and sanitize or omit them
before setting attributes; update uses of _set_span_attribute(span,
SpanAttributes.GEN_AI_USER, ...) and GEN_AI_HEADERS to first check for presence
(avoid recording the literal "None"), apply a sanitizer/allowlist that strips
PII/tokens (or only pick specific safe keys), and only call _set_span_attribute
with the sanitized or explicitly allowed value (or skip setting the attribute
entirely if nothing safe remains).
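A sanitizer along the lines this prompt asks for might look like the following sketch. The allowlist contents and function name are assumptions for illustration, not part of the PR.

```python
# Hypothetical allowlist; real code would choose keys known to be PII-free.
SAFE_HEADER_KEYS = {"content-type", "accept", "user-agent"}


def sanitize_headers(headers):
    """Drop any header not on the allowlist so auth tokens, cookies, and
    other sensitive values never reach span attributes. Returns None when
    nothing safe remains, so callers can skip setting the attribute."""
    if not headers:
        return None
    safe = {k: v for k, v in headers.items() if k.lower() in SAFE_HEADER_KEYS}
    return safe or None
```

The wrapper would then only call _set_span_attribute when the sanitizer returns a non-None value.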
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py`:
- Around line 169-179: The _set_input_messages function currently drops
empty-string prompts and collapses list prompts to the first element; update it
so it preserves list-shaped prompts and allows empty-string items: remove the
truthiness check that returns on empty prompt, treat prompt uniformly (if it's a
list use it as-is, otherwise wrap non-list prompts into a single-item list),
then JSON-serialize an array mapping each item to
{"role":"user","parts":[{"content": item, "type":"text"}]} and pass that to
_set_span_attribute with GenAIAttributes.GEN_AI_INPUT_MESSAGES so batched
prompts and token arrays are fully recorded.
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py`:
- Around line 5-7: The TypedDict _FunctionToolCall currently declares arguments:
Optional[dict[str, Any]] but the instrumentation emits/passes a raw JSON string;
update the type to match the emitted payload by changing the annotation on
arguments in _FunctionToolCall to Optional[str] (or, if you intentionally
support both parsed and raw forms, make it explicit as Optional[Union[str,
dict[str, Any]]]) so type-checkers and downstream callers see the correct shape;
ensure the change targets the _FunctionToolCall declaration and any related type
imports (e.g., add typing.Union if chosen).
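As a rough illustration of the Union option: the class below is a hypothetical stand-in for the private _FunctionToolCall (the real declaration lives in event_models.py), paired with a normalizer showing how downstream code could accept both forms.

```python
import json
from typing import Any, Optional, TypedDict, Union


class FunctionToolCall(TypedDict, total=False):
    # Illustrative stand-in for _FunctionToolCall: "arguments" may be the
    # raw JSON string the API emits, or an already-parsed dict.
    name: str
    arguments: Optional[Union[str, dict[str, Any]]]


def normalize_arguments(arguments):
    """Coerce either accepted form into a dict (or None) for downstream use."""
    if isinstance(arguments, str):
        return json.loads(arguments)
    return arguments
```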
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py`:
- Around line 123-126: Replace the incorrect use of
GenAIAttributes.GEN_AI_OPERATION_NAME for modalities and the non-standard
operation value "realtime": update the call that sets
f"{GenAIAttributes.GEN_AI_OPERATION_NAME}.modalities" (within the
_set_span_attribute call that references session.modalities) to instead set the
semantic attribute for output modality (use gen_ai.output.type) with one of the
well-known values like "text", "json", or "speech"; and ensure any span
attribute assignment that sets GenAIAttributes.GEN_AI_OPERATION_NAME (search for
usages that assign "realtime") uses one of the allowed operation names ("chat",
"create_agent", "embeddings", "execute_tool", "generate_content",
"invoke_agent", "retrieval", or "text_completion") or is replaced with a
justified custom attribute name if a non-standard operation is truly required.
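One way to read the suggested mapping is sketched below. "speech" for audio is taken from the allowed gen_ai.output.type values listed in the prompt above; the first-match policy for multi-modal sessions is an assumption.

```python
# Assumed mapping from realtime session modalities to gen_ai.output.type.
MODALITY_TO_OUTPUT_TYPE = {"text": "text", "audio": "speech"}


def output_type_for(modalities):
    """Return the gen_ai.output.type value for the first recognized
    modality, or None when nothing maps; sketch only."""
    for modality in modalities or []:
        if modality in MODALITY_TO_OUTPUT_TYPE:
            return MODALITY_TO_OUTPUT_TYPE[modality]
    return None
```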
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml`:
- Around line 78-105: The cassette test_chat_with_messages_attributes.yaml
contains sensitive response headers (openai-organization, openai-project,
set-cookie, x-request-id) that weren't scrubbed; update the VCR scrubber used
for this suite by adding response filtering in before_record (or extending
filter_headers) to remove or replace those header keys (openai-organization,
openai-project, set-cookie, x-request-id) from recorded responses, then re-run
the tests to re-record test_chat_with_messages_attributes.yaml so the cassette
no longer contains those sensitive values.
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml`:
- Around line 33-34: The cassette contains account-scoped headers (traceparent,
openai-organization, openai-project, set-cookie, x-request-id) that must be
scrubbed before committing; add VCR header filters or a
before_record/before_record_response hook to replace those header values with
deterministic placeholders (e.g., <TRACEPARENT>, <ORG>, <PROJECT>, <SET_COOKIE>,
<X_REQUEST_ID>), update any matching body fields if present, then re-record the
cassette (the YAML under
test_completions/test_completion_with_messages_attributes.yaml and other
affected cassettes referenced in lines ~79-110) so the committed cassette
contains only scrubbed placeholders.
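Both cassette findings call for the same VCR.py scrubbing; a minimal hook might look like this sketch. The header list comes from the comments above, and the hook shape follows VCR.py's before_record_response contract (recorded response dict in, response dict out).

```python
# Header names flagged in the review; extend as needed.
SENSITIVE_HEADERS = {
    "openai-organization",
    "openai-project",
    "set-cookie",
    "x-request-id",
    "traceparent",
}


def scrub_response(response):
    """before_record_response hook: replace sensitive response headers with
    a placeholder before the cassette is written to disk."""
    headers = response.get("headers", {})
    for key in list(headers):
        if key.lower() in SENSITIVE_HEADERS:
            headers[key] = ["<SCRUBBED>"]
    return response
```

Wired up via e.g. vcr.VCR(before_record_response=scrub_response, filter_headers=["authorization"]), after which the affected cassettes need re-recording.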
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a10d3c9a-1341-48bb-a2ef-ae05c4745cff
⛔ Files ignored due to path filters (1)
packages/opentelemetry-instrumentation-openai/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/__init__.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/config.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/embeddings_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py
- packages/opentelemetry-instrumentation-openai/pyproject.toml
- packages/opentelemetry-instrumentation-openai/tests/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/test_assistant.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat_parse.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_completions.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_embeddings.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_functions.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_streaming_with_api_usage.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_vision.py
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (1)
80-85: ⚠️ Potential issue | 🟡 Minor: Skip tool parameters when the schema is absent.

json.dumps(function.get("parameters")) turns a missing schema into the literal string "null", so spans claim a parameters payload even when none was provided. Guard on parameters is not None before setting the attribute.

💡 Suggested fix
```diff
 for i, function in enumerate(functions):
     prefix = f"{GenAIAttributes.GEN_AI_TOOL_DEFINITIONS}.{i}"
     _set_span_attribute(span, f"{prefix}.name", function.get("name"))
     _set_span_attribute(span, f"{prefix}.description", function.get("description"))
-    _set_span_attribute(
-        span, f"{prefix}.parameters", json.dumps(function.get("parameters"))
-    )
+    parameters = function.get("parameters")
+    if parameters is not None:
+        _set_span_attribute(span, f"{prefix}.parameters", json.dumps(parameters))
@@
 for i, tool in enumerate(tools):
     function = tool.get("function")
     if not function:
         continue
     prefix = f"{GenAIAttributes.GEN_AI_TOOL_DEFINITIONS}.{i}"
     _set_span_attribute(span, f"{prefix}.name", function.get("name"))
     _set_span_attribute(span, f"{prefix}.description", function.get("description"))
-    _set_span_attribute(
-        span, f"{prefix}.parameters", json.dumps(function.get("parameters"))
-    )
+    parameters = function.get("parameters")
+    if parameters is not None:
+        _set_span_attribute(span, f"{prefix}.parameters", json.dumps(parameters))
```

Also applies to: 97-102
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py` around lines 80 - 85, The code is unconditionally serializing function.get("parameters") which turns a missing schema into the string "null"; update the logic in the block that builds tool definition span attributes (the code using GenAIAttributes.GEN_AI_TOOL_DEFINITIONS, _set_span_attribute and the local variable function) to first check whether function.get("parameters") is not None and only call _set_span_attribute for the ".parameters" key when parameters is present; apply the same guard to the other identical block around the 97-102 region so you don't record a "null" parameters payload in spans.
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (1)
169-183: ⚠️ Potential issue | 🟠 Major: Handle non-string list prompts as one completion input.

The completions client still allows prompt to be a string, an array of strings, an array of tokens, or an array of token arrays. This branch treats every list as "one message per element", so tokenized prompts get split across multiple gen_ai.input.messages entries and nested token arrays end up as list-valued content. Restrict the multi-message path to list[str] and serialize other non-string shapes as a single message. (raw.githubusercontent.com)

💡 Suggested fix
```diff
 def _set_input_messages(span, prompt):
     if not span.is_recording() or prompt is None:
         return

-    prompts = prompt if isinstance(prompt, list) else [prompt]
+    prompts = (
+        prompt
+        if isinstance(prompt, list) and all(isinstance(p, str) for p in prompt)
+        else [prompt]
+    )
     messages = [
-        {"role": "user", "parts": [{"content": p, "type": "text"}]}
+        {
+            "role": "user",
+            "parts": [
+                {
+                    "content": p if isinstance(p, str) else json.dumps(p),
+                    "type": "text",
+                }
+            ],
+        }
         for p in prompts
     ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py` around lines 169 - 183, The _set_input_messages function currently treats any list prompt as multiple string messages which splits token lists; update it so only a list of strings (list[str]) is expanded into multiple messages, and for any other prompt shape (e.g., list of tokens or nested lists) serialize the entire prompt as a single message content; modify the branch in _set_input_messages (referencing the function name and GenAIAttributes.GEN_AI_INPUT_MESSAGES and _set_span_attribute) to detect isinstance(prompt, list) and all(isinstance(p, str) for p in prompt) before creating multiple {"role":"user","parts":[{"content":..., "type":"text"}]} entries, otherwise create a single message with the serialized prompt value.

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (1)
135-138: ⚠️ Potential issue | 🟠 Major: Don't emit raw user and header values into spans.

OpenAI documents user as an end-user identifier and extra_headers as arbitrary request headers. Recording both verbatim here turns stable identifiers and possible auth/session data into durable telemetry; sanitize/allowlist them or drop them entirely before calling _set_span_attribute. (raw.githubusercontent.com)
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f3419d11-cacd-4bd5-a8c7-e9f12d1ed800
📒 Files selected for processing (12)

- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
✅ Files skipped from review due to trivial changes (3)
- packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py (1)
292-300: Make None explicit in request-attribute tests.

Line 294 and Line 300 currently validate "missing key" behavior, while the docstrings describe explicit None. Passing headers=None/user=None directly would better lock intent and future-proof these tests.

Suggested refinement
```diff
 class TestRequestAttributesSerialization:
     def test_none_headers_not_recorded_as_string(self, mock_span):
         """kwargs with headers=None should not record literal 'None'."""
-        _set_request_attributes(mock_span, {"model": "gpt-4"})
+        _set_request_attributes(mock_span, {"model": "gpt-4", "headers": None})
         val = mock_span._attrs.get(SpanAttributes.GEN_AI_HEADERS)
         assert val != "None", "Should not record literal string 'None'"

     def test_none_user_not_recorded(self, mock_span):
         """kwargs with user=None should not record attribute."""
-        _set_request_attributes(mock_span, {"model": "gpt-4"})
+        _set_request_attributes(mock_span, {"model": "gpt-4", "user": None})
         assert SpanAttributes.GEN_AI_USER not in mock_span._attrs
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py` around lines 292 - 300, The tests claim to check explicit None values but currently omit them; update test_none_headers_not_recorded_as_string to call _set_request_attributes(mock_span, {"model": "gpt-4"}, headers=None) and assert that SpanAttributes.GEN_AI_HEADERS is either absent or None (not the literal string "None"), and update test_none_user_not_recorded to call _set_request_attributes(mock_span, {"model": "gpt-4"}, user=None) and assert that the user attribute is not present (or is None) in mock_span._attrs; use the existing test function names (test_none_headers_not_recorded_as_string, test_none_user_not_recorded) and the helper _set_request_attributes/SpanAttributes symbols to locate and change the calls and assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bf095afb-423d-45a1-9f9e-4de2c2e07909
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py
Force-pushed from c2b8059 to 33f2b26
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py (1)
261-271: ⚠️ Potential issue | 🟠 Major — Handle dict-shaped `usage` before dereferencing token fields.
`traced_response.usage` can be a plain dict here: the fallback type alias is `Dict[str, Any]`, and Lines 529, 595, 695, and 757 also rehydrate saved state through `model_dump()`. This block still uses attribute access for `input_tokens`, `output_tokens`, `total_tokens`, and `input_tokens_details`, so those paths can raise here and skip span enrichment.

🐛 Proposed fix

```diff
 _set_span_attribute(span, OpenAIAttributes.OPENAI_RESPONSE_SERVICE_TIER, traced_response.response_service_tier)
 if usage := traced_response.usage:
-    _set_span_attribute(span, GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS, usage.input_tokens)
-    _set_span_attribute(span, GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS, usage.output_tokens)
-    _set_span_attribute(
-        span, SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, usage.total_tokens
-    )
-    if usage.input_tokens_details:
+    input_tokens = usage.get("input_tokens") if isinstance(usage, dict) else getattr(usage, "input_tokens", None)
+    output_tokens = usage.get("output_tokens") if isinstance(usage, dict) else getattr(usage, "output_tokens", None)
+    total_tokens = usage.get("total_tokens") if isinstance(usage, dict) else getattr(usage, "total_tokens", None)
+    input_tokens_details = (
+        usage.get("input_tokens_details") if isinstance(usage, dict)
+        else getattr(usage, "input_tokens_details", None)
+    )
+
+    _set_span_attribute(span, GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS, input_tokens)
+    _set_span_attribute(span, GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS, output_tokens)
+    _set_span_attribute(span, SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, total_tokens)
+    if input_tokens_details:
         _set_span_attribute(
             span,
             SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS,
-            usage.input_tokens_details.cached_tokens,
+            input_tokens_details.get("cached_tokens") if isinstance(input_tokens_details, dict)
+            else getattr(input_tokens_details, "cached_tokens", None),
         )
```

Also applies to: 275-289
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py` around lines 261 - 271, traced_response.usage may be a dict instead of an object with attributes, so update the span-enrichment code (the block that reads traced_response.usage and calls _set_span_attribute with GenAIAttributes/SpanAttributes) to handle both dict and attribute-style objects: detect with isinstance(usage, dict) and read tokens via usage.get("input_tokens"), usage.get("output_tokens"), usage.get("total_tokens"), and nested usage.get("input_tokens_details", {}).get("cached_tokens") when dict; otherwise use the existing attribute access (usage.input_tokens, usage.output_tokens, usage.total_tokens, usage.input_tokens_details.cached_tokens). Apply the same fix to the analogous usage-handling block later in the file (the second occurrence that currently uses attribute access).
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (1)
135-138: ⚠️ Potential issue | 🟠 Major — Don't persist raw caller metadata into spans.
The `None` guard helps, but Line 135 and Lines 136-138 still copy caller-provided user identifiers and full header dictionaries verbatim into telemetry. That can durably leak auth tokens, cookies, tenant IDs, or other PII into tracing backends. Please drop these attributes or emit only an allowlisted/sanitized subset.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py` around lines 135 - 138, The code currently records raw caller-supplied identifiers and full header dicts into spans via _set_span_attribute(span, SpanAttributes.GEN_AI_USER, kwargs.get("user")) and _set_span_attribute(span, SpanAttributes.GEN_AI_HEADERS, str(headers)), which can leak PII/tokens; remove these direct writes and either omit these attributes entirely or replace them with a sanitized/allowlisted representation. Update the block that reads kwargs.get("user") and kwargs.get("extra_headers")/kwargs.get("headers") so it does not call _set_span_attribute with raw values — instead either (a) do nothing, (b) set a safe boolean/enum like SpanAttributes.GEN_AI_USER_PRESENT or SpanAttributes.GEN_AI_HEADERS_PRESENT, or (c) build a sanitized dict by lowercasing header names and including only an allowlist (e.g., "content-type","user-agent","accept") and/or mask all values before calling _set_span_attribute; ensure changes are applied where _set_span_attribute, SpanAttributes.GEN_AI_USER, and SpanAttributes.GEN_AI_HEADERS are referenced.
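Option (c) from the prompt above — an allowlist with lowercased names — can be sketched in a few lines. The allowlist contents and the helper name `sanitize_headers` are illustrative assumptions, not part of the package:

```python
# Example allowlist; the real set would be chosen by the instrumentation authors.
_SAFE_HEADERS = {"content-type", "user-agent", "accept"}

def sanitize_headers(headers):
    """Return only allowlisted headers (names lowercased), so Authorization
    tokens, cookies, and tenant IDs never reach span attributes."""
    if not headers:
        return {}
    return {k.lower(): v for k, v in headers.items() if k.lower() in _SAFE_HEADERS}
```

The span would then record `str(sanitize_headers(headers))` instead of the raw dict, or nothing at all when the sanitized result is empty.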
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py`:
- Around line 5-7: The _FunctionToolCall TypedDict currently declares arguments
as a required key (arguments: Optional[str]) which forces the key to exist even
when absent in real payloads; update the _FunctionToolCall declaration to make
the arguments key optional by changing its annotation to
NotRequired[Optional[str]] and add/import NotRequired from typing_extensions
(same import pattern used elsewhere in this package) so emitted/test payloads
that omit the arguments key match the type model.
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py`:
- Around line 318-330: The code is referencing a non-existent semconv constant
GenAIAttributes.GEN_AI_TOOL_DEFINITIONS which will raise AttributeError; update
the instrumentation in responses_wrappers.py to use the literal string key
"gen_ai.tool.definitions" wherever GenAIAttributes.GEN_AI_TOOL_DEFINITIONS is
used (e.g., in the calls to _set_span_attribute that set description,
parameters, name), and ensure all occurrences are consistently replaced so span
attributes are set without relying on the missing semconv constant.
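The `NotRequired` change requested for `_FunctionToolCall` above looks roughly like this. The `name` field and the payload values are illustrative; only the `arguments: NotRequired[Optional[str]]` annotation is the point:

```python
from typing import Optional, TypedDict
try:  # NotRequired is in typing from Python 3.11; typing_extensions before that
    from typing import NotRequired
except ImportError:
    from typing_extensions import NotRequired

class _FunctionToolCall(TypedDict):
    name: str  # other fields in the real event_models.py may differ
    arguments: NotRequired[Optional[str]]  # key may be omitted entirely

# All three payload shapes now fit the type: key absent, None, or a string.
call_without_args: _FunctionToolCall = {"name": "get_weather"}
call_with_args: _FunctionToolCall = {"name": "get_weather", "arguments": "{}"}
```

A plain `arguments: Optional[str]` would force the key to exist (with value `None` at minimum), which contradicts the emitted payloads that drop the key when prompts are not sent.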
---
Outside diff comments:
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py`:
- Around line 261-271: traced_response.usage may be a dict instead of an object
with attributes, so update the span-enrichment code (the block that reads
traced_response.usage and calls _set_span_attribute with
GenAIAttributes/SpanAttributes) to handle both dict and attribute-style objects:
detect with isinstance(usage, dict) and read tokens via
usage.get("input_tokens"), usage.get("output_tokens"),
usage.get("total_tokens"), and nested usage.get("input_tokens_details",
{}).get("cached_tokens") when dict; otherwise use the existing attribute access
(usage.input_tokens, usage.output_tokens, usage.total_tokens,
usage.input_tokens_details.cached_tokens). Apply the same fix to the analogous
usage-handling block later in the file (the second occurrence that currently
uses attribute access).
---
Duplicate comments:
In
`@packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py`:
- Around line 135-138: The code currently records raw caller-supplied
identifiers and full header dicts into spans via _set_span_attribute(span,
SpanAttributes.GEN_AI_USER, kwargs.get("user")) and _set_span_attribute(span,
SpanAttributes.GEN_AI_HEADERS, str(headers)), which can leak PII/tokens; remove
these direct writes and either omit these attributes entirely or replace them
with a sanitized/allowlisted representation. Update the block that reads
kwargs.get("user") and kwargs.get("extra_headers")/kwargs.get("headers") so it
does not call _set_span_attribute with raw values — instead either (a) do
nothing, (b) set a safe boolean/enum like SpanAttributes.GEN_AI_USER_PRESENT or
SpanAttributes.GEN_AI_HEADERS_PRESENT, or (c) build a sanitized dict by
lowercasing header names and including only an allowlist (e.g.,
"content-type","user-agent","accept") and/or mask all values before calling
_set_span_attribute; ensure changes are applied where _set_span_attribute,
SpanAttributes.GEN_AI_USER, and SpanAttributes.GEN_AI_HEADERS are referenced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8c9ec5df-f688-4d8a-8a41-3d6ada1449fc
⛔ Files ignored due to path filters (1)
`packages/opentelemetry-instrumentation-openai/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (27)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/__init__.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/config.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/embeddings_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py
- packages/opentelemetry-instrumentation-openai/pyproject.toml
- packages/opentelemetry-instrumentation-openai/tests/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_assistant.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat_parse.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_completions.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_embeddings.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_functions.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_messages_attributes.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_streaming_with_api_usage.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_vision.py
✅ Files skipped from review due to trivial changes (13)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/config.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_streaming_with_api_usage.py
- packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat_parse.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_assistant.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_embeddings.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_vision.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_completions/test_completion_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py
- packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_chat/test_chat_with_messages_attributes.yaml
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
🚧 Files skipped from review as they are similar to previous changes (7)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/__init__.py
- packages/opentelemetry-instrumentation-openai/pyproject.toml
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
- packages/opentelemetry-instrumentation-openai/tests/conftest.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_completions.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py
...telemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py (outdated, resolved)
...lemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py (resolved)
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_event_emitter.py (1)
68-211: Optional: parameterize event-type/prompt-state cases to reduce duplication. Both MessageEvent and ChoiceEvent suites repeat the same patch/assert scaffolding; a parametrized matrix would keep the intent while shrinking the test-maintenance surface.
♻️ Example refactor sketch
```diff
+@pytest.mark.parametrize(
+    "event_factory,send_prompts,tool_call_factory",
+    [
+        ("message", True, _make_tool_call_without_arguments),
+        ("message", False, _make_tool_call_without_arguments),
+        ("message", False, _make_tool_call_with_none_arguments),
+        ("message", False, _make_tool_call_with_arguments),
+        ("choice", True, _make_tool_call_without_arguments),
+        ("choice", False, _make_tool_call_without_arguments),
+    ],
+)
+def test_emit_tool_call_arguments_behavior(mock_event_logger, event_factory, send_prompts, tool_call_factory):
+    ...
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_event_emitter.py` around lines 68 - 211, Combine the duplicate MessageEvent and ChoiceEvent tests into a parametric test that iterates over event classes (MessageEvent, ChoiceEvent), tool-call variants (_make_tool_call_without_arguments, _make_tool_call_with_none_arguments, _make_tool_call_with_arguments), and should_send_prompts True/False; in the single test call emit_event(event) and assert mock_event_logger.emit was called and that body["tool_calls"][0]["function"] either retains name "get_weather" when arguments are absent/should_send_prompts=True or does not contain the "arguments" key when should_send_prompts=False or arguments are None, reusing the existing helpers and patches for should_emit_events and should_send_prompts and keeping mock_event_logger fixture.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/test_event_emitter.py`:
- Around line 68-211: Combine the duplicate MessageEvent and ChoiceEvent tests
into a parametric test that iterates over event classes (MessageEvent,
ChoiceEvent), tool-call variants (_make_tool_call_without_arguments,
_make_tool_call_with_none_arguments, _make_tool_call_with_arguments), and
should_send_prompts True/False; in the single test call emit_event(event) and
assert mock_event_logger.emit was called and that
body["tool_calls"][0]["function"] either retains name "get_weather" when
arguments are absent/should_send_prompts=True or does not contain the
"arguments" key when should_send_prompts=False or arguments are None, reusing
the existing helpers and patches for should_emit_events and should_send_prompts
and keeping mock_event_logger fixture.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 58362de4-f7a3-4762-9218-57218fe75a46
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_event_emitter.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/event_models.py
…bility

Bring back all LLM_* constants removed in v0.5.0 with their original string values. This allows non-migrated instrumentation packages to depend on semconv >=0.5.1 without code changes.

- Add legacy LLM_* constants to SpanAttributes with TODO comments
- Add GEN_AI_USAGE_CACHE_*_DEPRECATED for value-changed cache attrs
- Update _testing.py to verify legacy constants are present
- Bump semconv to 0.5.1
- Update all 32 packages to depend on >=0.5.1,<0.6.0
- Add local uv source overrides for semconv in all packages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```python
def _map_finish_reason(reason):
    if not reason:
        return "stop"
```

Suggested change

```diff
-    return "stop"
+    return reason
```
```python
    })
elif role == "user":
    content = msg.get("content")
    parts = [{"content": content, "type": "text"}] if isinstance(content, str) else (content or [])
```

You need a util method `_content_to_parts` here, as you did in the Anthropic PR, instead of passing the content through as-is.
```python
    })
elif role in ["system", "developer"]:
    content = msg.get("content")
    parts = [{"content": content, "type": "text"}] if isinstance(content, str) else (content or [])
```
```python
    })
elif role == "assistant":
    content = msg.get("content")
    parts = [{"content": content, "type": "text"}] if isinstance(content, str) else (content or [])
```
```python
if Config.use_messages_attributes:
    _set_output_messages(span, choices)
else:
    _legacy_set_completions(span, choices)
```

Do we really need this condition?
```python
            "parts": parts,
            "finish_reason": _map_finish_reason(choice.get("finish_reason")),
        })
    _set_span_attribute(span, GenAIAttributes.GEN_AI_OUTPUT_MESSAGES, json.dumps(messages))
```

In the Anthropic PR you also added the finish reasons as a top-level attribute, not only inside the messages JSON, so please add that here as well.
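One way to record both, sketched against a plain dict instead of a real span object. The attribute names follow semconv 0.5.0 (`gen_ai.output.messages`, `gen_ai.response.finish_reasons`); the helper name is hypothetical:

```python
import json

def set_output_attributes(attrs, choices):
    """Record output messages and also surface finish reasons as a
    top-level array attribute, mirroring the Anthropic instrumentation."""
    messages = [
        {
            "role": "assistant",
            "parts": [{"content": c.get("content", ""), "type": "text"}],
            "finish_reason": c.get("finish_reason") or "stop",
        }
        for c in choices
    ]
    attrs["gen_ai.output.messages"] = json.dumps(messages)
    attrs["gen_ai.response.finish_reasons"] = [m["finish_reason"] for m in messages]

attrs = {}
set_output_attributes(attrs, [{"content": "hi", "finish_reason": None}])
```

The top-level array lets backends filter on finish reasons without parsing the messages JSON.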
```python
_set_span_attribute(
    span,
    f"{GenAIAttributes.GEN_AI_PROMPT}.{prompt_index}.content",
    traced_response.instructions,
)
_set_span_attribute(span, f"{GenAIAttributes.GEN_AI_PROMPT}.{prompt_index}.role", "system")
prompt_index += 1
```

Isn't that the old format? It should be:

```python
input_messages.append({
    "role": "system",
    "parts": [{"type": "text", "content": traced_response.instructions}],
})
```

Also make sure it's covered by tests.
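The shape the reviewer asks for can be packaged as a small constructor that serializes straight into `gen_ai.input.messages`. This is a sketch; the function name is hypothetical:

```python
import json

def instruction_to_input_message(instructions):
    """Render responses-API `instructions` in the gen_ai.input.messages
    shape, rather than the legacy indexed gen_ai.prompt.N.* keys."""
    return {
        "role": "system",
        "parts": [{"type": "text", "content": instructions}],
    }

# Value that would be written to the gen_ai.input.messages span attribute:
serialized = json.dumps([instruction_to_input_message("You are terse.")])
```

A round-trip test on this constructor is exactly the kind of coverage the comment requests.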
```python
_set_span_attribute(span, f"{GenAIAttributes.GEN_AI_COMPLETION}.0.role", "assistant")
if traced_response.output_text:
    _set_span_attribute(
        span, f"{GenAIAttributes.GEN_AI_COMPLETION}.0.content", traced_response.output_text
```
These dev-only overrides don't belong in the semconv branch. Only traceloop-sdk and langchain retain their pre-existing overrides. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from b094ce3 to 4c08e21
Summary
Update OpenAI instrumentation to align with OTel GenAI Semantic Conventions 0.5.0.
- `SpanAttributes` migrated to upstream `gen_ai_attributes` and `openai_attributes` (`gen_ai.request.frequency_penalty`, `gen_ai.request.presence_penalty`, `gen_ai.operation.name`, etc.)
- `gen_ai.openai.request.service_tier`/`gen_ai.openai.response.service_tier` replaced with `openai.request.service_tier`/`openai.response.service_tier`
- `gen_ai.input.messages`/`gen_ai.output.messages` support (`use_messages_attributes` flag) following the OTel JSON schema (`role`, `parts[{type, content}]`, `finish_reason`)
- `tool_calls` finish reason mapped to semconv's `tool_call`
- `opentelemetry-semantic-conventions-ai` dependency bumped to `>=0.5.0,<0.6.0` and `opentelemetry-semantic-conventions` to `>=0.60b1`
- `test_messages_attributes.py` covers input/output message formatting (multi-turn, tool calls, edge cases)
- `test_chat_with_messages_attributes` VCR test validates end-to-end attribute shape (`gpt-4.1-nano`)
New Features
Refactor
Tests
Chores