feat(gemini): migrate vertexai and google-generativeai to OTel GenAI semantic conventions #3840

avivhalfon wants to merge 2 commits into main
📝 Walkthrough
This PR updates GenAI semantic attributes: replaces provider/system keys with
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
…semantic conventions
Force-pushed from 7c8d572 to f839c29
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
Lines 418-433: ⚠️ Potential issue | 🟠 Major — Add required `gen_ai.operation.name` attribute to token usage histogram.

The `gen_ai.client.token.usage` metric requires `gen_ai.operation.name` alongside provider and token type per the OpenTelemetry GenAI specification. Without it, token usage from all operations collapses into the same metric series. Add `GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content"` to both histogram record calls.

Diff:

```diff
 attributes={
+    GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content",
     GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
     GenAIAttributes.GEN_AI_TOKEN_TYPE: "input",
     GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
 }
@@
 attributes={
+    GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content",
     GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
     GenAIAttributes.GEN_AI_TOKEN_TYPE: "output",
     GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 418 - 433, The token usage histogram calls in span_utils.py are missing the required GenAIAttributes.GEN_AI_OPERATION_NAME attribute; update both token_histogram.record invocations (the ones using response.usage_metadata.prompt_token_count and response.usage_metadata.candidates_token_count) to include GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content" alongside the existing provider, token type and GenAIAttributes.GEN_AI_RESPONSE_MODEL (llm_model) attributes so metrics are separated per operation.
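As a concrete illustration of the attribute set this finding asks for, here is a minimal stand-alone sketch using plain dicts; the string keys are the values the `GenAIAttributes` constants resolve to, and the function name is illustrative, not from the codebase:

```python
# Sketch of the required attribute set for the gen_ai.client.token.usage
# histogram, with the operation name the review asks to add.
def token_usage_attributes(token_type: str, llm_model: str) -> dict:
    return {
        "gen_ai.operation.name": "generate_content",  # attribute the review adds
        "gen_ai.provider.name": "gcp.gen_ai",
        "gen_ai.token.type": token_type,  # "input" or "output"
        "gen_ai.response.model": llm_model,
    }

input_attrs = token_usage_attributes("input", "gemini-1.5-pro")
output_attrs = token_usage_attributes("output", "gemini-1.5-pro")
```

With the operation name present, input and output token counts from `generate_content` calls land in distinct, correctly dimensioned metric series.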
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py`:
- Around line 206-209: Update the GenAI operation attribute from "chat" to the
semconv name "generate_content" and add that operation attribute to the duration
metric attributes so the histogram includes both gen_ai.operation.name and
gen_ai.system; specifically change the attributes dict where
GenAIAttributes.GEN_AI_OPERATION_NAME is set (currently "chat") to
"generate_content", and when building the gen_ai.client.operation.duration
histogram attributes include
GenAIAttributes.GEN_AI_OPERATION_NAME="generate_content" alongside
GenAIAttributes.GEN_AI_SYSTEM; apply the same fixes for the
AsyncModels.generate_content* and Models.generate_content* wrappers (the other
occurrences referenced near the existing attributes and the sync wrapper
sections).
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py`:
- Line 31: EVENT_ATTRIBUTES currently only sets
GenAIAttributes.GEN_AI_PROVIDER_NAME and the code emits deprecated event names;
update the non-legacy path to emit the new semantic event
"gen_ai.client.inference.operation.details" instead of "gen_ai.{role}.message"
or "gen_ai.choice" and ensure EVENT_ATTRIBUTES (and the event emission call
sites) include the required attributes: GenAIAttributes.GEN_AI_PROVIDER_NAME
(value "gcp.gen_ai") and GenAIAttributes.GEN_AI_OPERATION_NAME (attribute key
"gen_ai.operation.name") plus any other required fields from the GenAI spec;
locate EVENT_ATTRIBUTES and the emit/send calls in event_emitter.py and replace
the deprecated event names with "gen_ai.client.inference.operation.details" and
populate the attribute map with the operation name and provider before emitting.
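A stand-alone sketch of the non-legacy event shape this comment describes. The event name and attribute keys are taken from the review text; the dict is a hypothetical stand-in for the actual OTel events API payload, shown only to make the required fields concrete:

```python
# Hypothetical stand-in for the event payload described in the review:
# one "gen_ai.client.inference.operation.details" event replaces the
# deprecated per-message "gen_ai.{role}.message" / "gen_ai.choice" events.
def build_inference_details_event(input_messages: list) -> dict:
    return {
        "name": "gen_ai.client.inference.operation.details",
        "attributes": {
            "gen_ai.provider.name": "gcp.gen_ai",
            "gen_ai.operation.name": "generate_content",
        },
        "body": {"input_messages": input_messages},
    }

event = build_inference_details_event(
    [{"role": "user", "parts": [{"type": "text", "content": "hi"}]}]
)
```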
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 205-235: The messages payload currently uses a top-level "content"
field and JSON-encodes parts (producing double-serialization) and uses
{"type":"text","text":...} instead of the GenAI semantic convention; update the
three builders (the async input block that calls _process_content_item and
_process_argument, the sync input block with the same pattern, and the output
builder that uses response.text) to construct each message as {"role": <role>,
"parts": [ ... ]} where each text part is {"type":"text","content": "<plain
string>"} (do not json.dumps the inner parts — only json.dumps the final
messages list when calling _set_span_attribute). Locate and change the code that
currently creates {"role": ..., "content": json.dumps(...)} to instead build a
parts list from processed_content (convert strings to [{"type":"text","content":
...}] and ensure processed_content lists/dicts map to parts with "content"
fields) and pass that structure into _set_span_attribute with
GEN_AI_INPUT_MESSAGES (and make the analogous change for output messages that
used response.text).
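The transformation described above can be sketched in isolation. The helper below is a simplified stand-in for the three builders named in the comment (not the actual code); it shows the key point that inner parts are never `json.dumps`-ed, only the final messages list is:

```python
import json

def to_parts_message(role: str, content) -> dict:
    """Build a GenAI semconv message: a plain string becomes a single text
    part; processed part dicts are normalized to a 'content' field."""
    if isinstance(content, str):
        parts = [{"type": "text", "content": content}]
    else:
        parts = [
            {"type": p.get("type", "text"), "content": p.get("content", p.get("text"))}
            for p in content
        ]
    return {"role": role, "parts": parts}

messages = [to_parts_message("user", "What is ai?")]
# Serialize exactly once, on the final messages list -- no double encoding.
attr_value = json.dumps(messages)
```

A reader of the span attribute then gets `parts` back directly after a single `json.loads`, which is what the spec (and the strengthened tests below in this review) expect.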
In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 49-59: The current test only asserts message role and misses
validating the required 'parts' structure; update the assertions after loading
input_messages and output_messages (the variables in this diff) to assert that
each message has a 'parts' field that is a non-empty list and that parts[0]
contains 'type' and 'content' keys (e.g., assert parts exists, len(parts) > 0,
parts[0]["type"] and parts[0]["content"] are present/non-empty) for both
input_messages[0] and output_messages[0] to enforce the OpenTelemetry GenAI
semantic convention.
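The strengthened assertions could look like the following sketch, written against a hypothetical `attrs` dict shaped like the span attributes the test reads (the message contents are invented for illustration):

```python
import json

# Hypothetical span-attribute dict, shaped like the one the test loads.
attrs = {
    "gen_ai.input.messages": json.dumps(
        [{"role": "user", "parts": [{"type": "text", "content": "What is ai?"}]}]
    ),
    "gen_ai.output.messages": json.dumps(
        [{
            "role": "assistant",
            "parts": [{"type": "text", "content": "AI is machine intelligence."}],
            "finish_reason": "stop",
        }]
    ),
}

for key, expected_role in [("gen_ai.input.messages", "user"),
                           ("gen_ai.output.messages", "assistant")]:
    messages = json.loads(attrs[key])
    assert len(messages) > 0
    assert messages[0]["role"] == expected_role
    parts = messages[0].get("parts")
    assert isinstance(parts, list) and len(parts) > 0  # spec requires 'parts'
    assert parts[0]["type"] and parts[0]["content"]    # non-empty type/content
```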
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`:
- Around line 239-240: Update the GenAI semantic attributes to set
GenAIAttributes.GEN_AI_PROVIDER_NAME to "gcp.vertex_ai" instead of "vertex_ai",
and compute GenAIAttributes.GEN_AI_OPERATION_NAME dynamically in both the async
wrapper and the sync wrapper (where these attributes are currently hard-coded)
by mapping the wrapped API/method name to the correct operation token
("generate_content", "text_completion", or "chat") rather than always using
"chat"; locate the assignment sites that set
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
in the async wrapper and the sync wrapper and replace the hard-coded operation
with a small function or conditional that inspects the wrapped method name
(e.g., method.name or the API call identifier) to choose the correct operation.
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py`:
- Line 32: Update EVENT_ATTRIBUTES so GenAIAttributes.GEN_AI_PROVIDER_NAME is
set to the canonical value "gcp.vertex_ai" (replace the current "vertex_ai");
then remove or stop emitting the deprecated per-message event attributes (the
per-event keys like gen_ai.user.message, gen_ai.assistant.message,
gen_ai.choice) and instead ensure the instrumentation records chat history using
span attributes such as gen_ai.input.messages and gen_ai.output.messages where
the relevant code constructs span attributes for requests/responses (look for
usages in event_emitter.py that emit per-message events and change them to
populate the span attribute arrays accordingly).
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py`:
- Around line 255-258: The current span_utils.py path only checks
kwargs.get("prompt") and misses when callers pass message=..., so modify the
logic in the span-attribute population to also check kwargs.get("message") and
normalize it the same way as the positional args handling (reuse the same
parts/normalization code path used for args) before serializing; specifically
update the block that builds messages and calls _set_span_attribute(span,
GEN_AI_INPUT_MESSAGES, json.dumps(messages)) to accept either prompt or message
(or fall back to the existing args-derived parts) so ChatSession.send_message*
invocations with message=... produce the same parts schema and populate
gen_ai.input.messages consistently.
- Around line 218-227: The span attribute payloads emitted by
_process_vertexai_argument/_set_span_attribute must follow GenAI semconv:
replace {"role":"user","content":...} with
{"role":"user","parts":[{"type":"text","content":...}]} (avoid double-encoding
JSON), ensure assistant outputs include a required "finish_reason" field
alongside {"role":"assistant","parts":[...,"finish_reason":...]} and that any
plain string content is wrapped into a parts array with
{"type":"text","content":...}; also update set_model_input_attributes to check
both kwargs.get("prompt") and kwargs.get("message") (the
ChatSession.send_message() keyword) so message= inputs get recorded into
GEN_AI_INPUT_MESSAGES. Locate and adjust the logic in
_process_vertexai_argument, the blocks that append messages (currently building
{"role","content"}), the output handling that emits plain strings, and the
set_model_input_attributes function to implement these changes.
---
Outside diff comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 418-433: The token usage histogram calls in span_utils.py are
missing the required GenAIAttributes.GEN_AI_OPERATION_NAME attribute; update
both token_histogram.record invocations (the ones using
response.usage_metadata.prompt_token_count and
response.usage_metadata.candidates_token_count) to include
GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content" alongside the existing
provider, token type and GenAIAttributes.GEN_AI_RESPONSE_MODEL (llm_model)
attributes so metrics are separated per operation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d4c2161f-a9bd-4e62-aa08-e6b3f2ae5221
⛔ Files ignored due to path filters (2)
- packages/opentelemetry-instrumentation-google-generativeai/uv.lock is excluded by !**/*.lock
- packages/opentelemetry-instrumentation-vertexai/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
- packages/opentelemetry-instrumentation-vertexai/pyproject.toml
- packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py
- packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
```python
# Input messages are now a single JSON array
assert "gen_ai.input.messages" in attrs
input_messages = json.loads(attrs["gen_ai.input.messages"])
assert len(input_messages) > 0
assert input_messages[0]["role"] == "user"

assert "gen_ai.completion.0.content" in attrs
assert attrs["gen_ai.completion.0.role"] == "assistant"
# Output messages are now a single JSON array
assert "gen_ai.output.messages" in attrs
output_messages = json.loads(attrs["gen_ai.output.messages"])
assert len(output_messages) > 0
assert output_messages[0]["role"] == "assistant"
```
🧩 Analysis chain
🏁 Scripts executed (repository: traceloop/openllmetry):

```shell
fd --type f --name "*.py" | grep -E "(google.*generative|instrumentation)" | head -20
git ls-files | grep -i "google.*generative" | head -30
find . -type f -name "test_generate_content.py" 2>/dev/null
cat -n packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py | head -100
cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | head -200
cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '196,350p'
cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '350,450p'
cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '372,392p'
cd packages/opentelemetry-instrumentation-google-generativeai && find . -type f -name "*.py" -exec grep -l "parts\|GEN_AI_INPUT_MESSAGES\|GEN_AI_OUTPUT_MESSAGES" {} \;
cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_models.py
web_search "OpenTelemetry GenAI semantic convention message format specification"
cat -n packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py | sed -n '40,75p'
```
🌐 Web query:
OpenTelemetry GenAI semantic convention gen_ai.input.messages gen_ai.output.messages message schema
💡 Result:
gen_ai.input.messages (Input messages JSON schema)

`gen_ai.input.messages` is an array of chat message objects ("InputMessages"). Each item is a ChatMessage with: role + parts (required), and optional name. [1]

```
[
  {
    "role": "system" | "user" | "assistant" | "tool" | "<provider-specific>",
    "name": "<optional participant name>",
    "parts": [
      // one or more message parts (see below)
    ]
  }
]
```

Supported parts[] variants (polymorphic by type) include: [1]

- `{"type":"text","content":string}`
- `{"type":"tool_call","id":string|null,"name":string,"arguments":any}`
- `{"type":"tool_call_response","id":string|null,"response":any}`
- `{"type":"server_tool_call","id":string|null,"name":string,"server_tool_call":{...}}`
- `{"type":"server_tool_call_response","id":string|null,"server_tool_call_response":{...}}`
- `{"type":"blob","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"content":"<base64 when JSON>"}`
- `{"type":"file","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"file_id":string}`
- `{"type":"uri","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"uri":string}`
- `{"type":"reasoning","content":string}`
- `{"type":"<any>","...": "..."}` (GenericPart, extensible)

gen_ai.output.messages (Output messages JSON schema)

`gen_ai.output.messages` is an array of output message objects ("OutputMessages"). Each item is an OutputMessage with: role, parts, and required finish_reason; optional name. [2][3]

```
[
  {
    "role": "assistant" | "tool" | "system" | "user" | "<provider-specific>",
    "name": "<optional participant name>",
    "parts": [
      // same part variants as input.messages
    ],
    "finish_reason": "stop" | "length" | "content_filter" | "tool_call" | "error" | "<provider-specific>"
  }
]
```

Notes from the semantic conventions text:
- Input messages must be in the order sent to the model. [2]
- Each output message corresponds to exactly one model “choice/candidate.” [2]
Sources: Input schema JSON [1], Output schema JSON [3], semconv GenAI events page (links to both schemas + behavioral notes) [2].
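To make the two schemas above concrete, here is a minimal sketch that builds one spec-shaped input message and one output message and serializes each attribute exactly once (the content strings are invented for illustration):

```python
import json

# Input message: role + parts are required; each text part uses "content".
input_messages = [
    {"role": "user", "parts": [{"type": "text", "content": "What is ai?"}]}
]

# Output message: same shape, plus the required finish_reason.
output_messages = [
    {
        "role": "assistant",
        "parts": [{"type": "text", "content": "AI is machine intelligence."}],
        "finish_reason": "stop",
    }
]

# Each span attribute is serialized once, at the end -- never per-part.
span_attrs = {
    "gen_ai.input.messages": json.dumps(input_messages),
    "gen_ai.output.messages": json.dumps(output_messages),
}
```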
🏁 Scripts executed (repository: traceloop/openllmetry):

```shell
cat packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '205,235p'
cat packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '126,164p'
```

```python
import json

# Simulate what the code does
processed_content = [{"type": "text", "text": "What is ai?"}]
message = {
    "role": "user",
    "content": json.dumps(processed_content)  # This is what the code does at line 217, 225, etc.
}
print("Message structure being created:")
print(json.dumps(message, indent=2))

print("\nAfter json.dumps at line 234:")
full_messages = [message]
print(json.dumps(full_messages))

print("\nWhat gets stored in span attribute:")
attr_value = json.dumps(full_messages)
print(attr_value)

print("\nWhat test sees after json.loads:")
test_sees = json.loads(attr_value)
print(json.dumps(test_sees, indent=2))
print(f"\ntest_sees[0] has keys: {test_sees[0].keys()}")
print(f"test_sees[0]['role']: {test_sees[0]['role']}")
print(f"Does test_sees[0] have 'parts'? {'parts' in test_sees[0]}")
print(f"Does test_sees[0] have 'content'? {'content' in test_sees[0]}")

# What spec requires
print("\n\n=== SPEC REQUIRES ===")
spec_compliant = {
    "role": "user",
    "parts": [{"type": "text", "content": "What is ai?"}]
}
print("Spec-compliant structure:")
print(json.dumps(spec_compliant, indent=2))
```
The test assertions are too weak to catch the message-format violation.
The spec requires a `parts` field (see the OpenTelemetry GenAI semantic convention), but the current assertions only check `role`. Add assertions for `parts[0]["type"]` and `parts[0]["content"]` to validate that the message structure complies with the semantic convention and to catch regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 49 - 59, The current test only asserts message role and misses
validating the required 'parts' structure; update the assertions after loading
input_messages and output_messages (the variables in this diff) to assert that
each message has a 'parts' field that is a non-empty list and that parts[0]
contains 'type' and 'content' keys (e.g., assert parts exists, len(parts) > 0,
parts[0]["type"] and parts[0]["content"] are present/non-empty) for both
input_messages[0] and output_messages[0] to enforce the OpenTelemetry GenAI
semantic convention.
```python
GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
```
🧩 Analysis chain
🌐 Web query:
OpenTelemetry gen_ai.operation.name predefined values and well-known provider.name value for Vertex AI
💡 Result:
gen_ai.operation.name predefined (well-known) values:
`chat`, `create_agent`, `embeddings`, `execute_tool`, `generate_content`, `invoke_agent`, `retrieval`, `text_completion` (opentelemetry.io)

gen_ai.provider.name well-known value for Vertex AI:
`gcp.vertex_ai` (used when accessing aiplatform.googleapis.com) (opentelemetry.io)
Citations:
- 1: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
- 2: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
🏁 Scripts executed (repository: traceloop/openllmetry):

```shell
# Find the Vertex AI instrumentation file and check the wrapped methods
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | head -100
# Check lines around 239-240
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | sed -n '235,245p'
# Check lines around 288-289
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | sed -n '284,294p'
# Search for WRAPPED_METHODS definition
rg "WRAPPED_METHODS" packages/opentelemetry-instrumentation-vertexai/ -A 10 -B 2
```
Use correct gen_ai.provider.name and derive gen_ai.operation.name from the wrapped API instead of hard-coding chat.
The GenAI semantic conventions require specific operation values—generate_content, text_completion, and chat—based on the API being called. Hard-coding "chat" for all methods mislabels generate_content* and predict* operations and collapses distinct operation types into one bucket. Additionally, the provider should be gcp.vertex_ai, not vertex_ai, per the OpenTelemetry registry.
Applies to lines 239-240 (async wrapper) and 288-289 (sync wrapper).
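A stand-alone sketch of the mapping this comment asks for; the method-name prefixes come from the wrapped Vertex AI APIs named in the review, and the function name is illustrative:

```python
# Map a wrapped Vertex AI method name to the semconv operation token.
# Anything unrecognized (including send_message*) falls back to "chat".
def resolve_operation_name(method_name: str) -> str:
    if method_name.startswith("generate_content"):
        return "generate_content"
    if method_name.startswith("predict"):
        return "text_completion"
    return "chat"

# Distinct operations no longer collapse into one "chat" bucket.
ops = {m: resolve_operation_name(m)
       for m in ("generate_content_async", "predict", "send_message")}
```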
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`
around lines 239 - 240, Update the GenAI semantic attributes to set
GenAIAttributes.GEN_AI_PROVIDER_NAME to "gcp.vertex_ai" instead of "vertex_ai",
and compute GenAIAttributes.GEN_AI_OPERATION_NAME dynamically in both the async
wrapper and the sync wrapper (where these attributes are currently hard-coded)
by mapping the wrapped API/method name to the correct operation token
("generate_content", "text_completion", or "chat") rather than always using
"chat"; locate the assignment sites that set
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
in the async wrapper and the sync wrapper and replace the hard-coded operation
with a small function or conditional that inspects the wrapped method name
(e.g., method.name or the API call identifier) to choose the correct operation.
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py (1)
Lines 239-240: ⚠️ Potential issue | 🟠 Major — Use semconv-defined provider and operation mapping instead of hard-coded values.

`GEN_AI_PROVIDER_NAME` should use the Vertex AI well-known token, and `GEN_AI_OPERATION_NAME` should be derived from the wrapped method (`generate_content*`/`predict*`/`send_message*`) instead of always `"chat"`. Current values collapse distinct operation types and skew telemetry dimensions. This also affects updated assertions in `packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py` and `packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py`.

Proposed fix

```diff
+def _resolve_genai_operation_name(to_wrap):
+    method = (to_wrap or {}).get("method", "")
+    if method.startswith("generate_content"):
+        return "generate_content"
+    if method.startswith("predict"):
+        return "text_completion"
+    if method.startswith("send_message"):
+        return "chat"
+    return "chat"
+
@@
         attributes={
-            GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
-            GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
+            GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.vertex_ai",
+            GenAIAttributes.GEN_AI_OPERATION_NAME: _resolve_genai_operation_name(
+                to_wrap
+            ),
         },
@@
         attributes={
-            GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
-            GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
+            GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.vertex_ai",
+            GenAIAttributes.GEN_AI_OPERATION_NAME: _resolve_genai_operation_name(
+                to_wrap
+            ),
         },
```

OpenTelemetry GenAI semantic conventions (latest): what are the well-known values for `gen_ai.provider.name` and `gen_ai.operation.name`, and which operation values should map to Vertex AI APIs `generate_content*`, `predict*`, and `send_message*`?

Based on learnings: follow the OpenTelemetry GenAI semantic specification at https://opentelemetry.io/docs/specs/semconv/gen-ai/
Also applies to: 288-289
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py` around lines 239 - 240, Replace the hard-coded GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME values with the semconv-defined provider token for Vertex AI ("gcp.vertex_ai") and a derived operation mapping based on the wrapped method name: map methods starting with "generate_content" -> operation "generate_content", "predict" -> "text_completion", and "send_message" -> "chat" (use the exact OpenTelemetry GenAI semantic-convention tokens for provider and operations). Implement this logic where GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME are set (the block that currently sets "vertex_ai" and "chat") so provider uses the canonical semconv constant and operation is chosen by checking the wrapped method name (e.g., wrapped_method_name.startswith("generate_content"), .startswith("predict"), .startswith("send_message")); update the related assertions in the disabled_test_bison.py and disabled_test_gemini.py fixtures to expect the semconv provider token and the mapped operation values. Ensure all references to the old hard-coded strings are replaced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 209-210: The branch that handles a plain string (currently
appending {"role": "user", "content": contents}) is inconsistent with other
branches that JSON-encode an array of parts; update that branch so it
JSON-encodes the string into an array-of-parts before appending (e.g., use
json.dumps([...]) to produce the same array-of-parts shape used in the other
branches), ensuring the messages list always has content as a JSON-encoded
array; locate this change where the variable contents is handled and messages is
appended in span_utils.py and mirror the encoding logic used in the other
branches.
---
Duplicate comments:
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`:
- Around line 239-240: Replace the hard-coded
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
values with the semconv-defined provider token for Vertex AI ("gcp.vertex_ai")
and a derived operation mapping based on the wrapped method name: map methods
starting with "generate_content" -> operation "generate_content", "predict" ->
"text_completion", and "send_message" -> "chat" (use the exact OpenTelemetry
GenAI semantic-convention tokens for provider and operations). Implement this
logic where
tokens for provider and operations). Implement this logic where
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
are set (the block that currently sets "vertex_ai" and "chat") so provider uses
the canonical semconv constant and operation is chosen by checking the wrapped
method name (e.g., wrapped_method_name.startswith("generate_content"),
.startswith("predict"), .startswith("send_message")); update the related
assertions in the disabled_test_bison.py and disabled_test_gemini.py fixtures to
expect the semconv provider token and the mapped operation values. Ensure all
references to the old hard-coded strings are replaced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 45b5433d-73a3-4460-bc3f-35d3bba6e3af
⛔ Files ignored due to path filters (2)
- packages/opentelemetry-instrumentation-google-generativeai/uv.lock is excluded by !**/*.lock
- packages/opentelemetry-instrumentation-vertexai/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
- packages/opentelemetry-instrumentation-vertexai/pyproject.toml
- packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py
- packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
✅ Files skipped from review due to trivial changes (4)
- packages/opentelemetry-instrumentation-vertexai/pyproject.toml
- packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
```python
GenAIAttributes.GEN_AI_SYSTEM: "Google",
SpanAttributes.LLM_REQUEST_TYPE: LLMRequestTypeValues.COMPLETION.value,
GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
```
Use `GenAiOperationNameValues.TEXT_COMPLETION.value` instead of a hardcoded value. And why change from completion to "chat"? A mistake from the beginning?
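A sketch of what that would look like. The `GenAiOperationNameValues` enum below is a minimal stand-in mirroring the incubating enum shipped in the opentelemetry-semconv package (defined locally here so the snippet is self-contained):

```python
from enum import Enum

# Stand-in mirroring the opentelemetry-semconv GenAiOperationNameValues enum.
class GenAiOperationNameValues(Enum):
    CHAT = "chat"
    TEXT_COMPLETION = "text_completion"
    GENERATE_CONTENT = "generate_content"

# Enum member instead of a hardcoded string literal:
attributes = {
    "gen_ai.operation.name": GenAiOperationNameValues.TEXT_COMPLETION.value,
}
```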
```python
attributes={
    GenAIAttributes.GEN_AI_SYSTEM: "Google",
    SpanAttributes.LLM_REQUEST_TYPE: LLMRequestTypeValues.COMPLETION.value,
    GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
```
Can't you import this from `GenAiSystemValues.<gcp>.value`?
```python
GEN_AI_INPUT_MESSAGES = "gen_ai.input.messages"
GEN_AI_OUTPUT_MESSAGES = "gen_ai.output.messages"
```
Import from `GenAIAttributes.GEN_AI_INPUT_MESSAGES` instead.
Do you know why the revision changed and upload time was added in many places?
```python
GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
```
Same comments as for google-generativeai.
Related to #3836
Summary by CodeRabbit
New Features
Chores
Tests