
feat(gemini): migrate vertexai and google-generativeai to OTel GenAI semantic conventions #3840

Open

avivhalfon wants to merge 2 commits into main from ah/gemini-sem-conv

Conversation


avivhalfon (Contributor) commented Mar 22, 2026

Related to #3836

Summary by CodeRabbit

  • New Features

    • Consolidated input and output message logging into JSON-encoded message arrays for clearer observability.
  • Chores

    • Standardized provider and operation naming across Google Generative AI and Vertex AI instrumentations (provider names and operation keys).
    • Bumped semantic-conventions-ai constraint to v0.5.x and realigned span attribute keys to updated conventions.
  • Tests

    • Updated tests to validate JSON message attributes and the revised provider/operation/span attribute names.


CLAassistant commented Mar 22, 2026

CLA assistant check
All committers have signed the CLA.


coderabbitai bot commented Mar 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 654356d2-313f-4311-9e68-c0804222ff76

📥 Commits

Reviewing files that changed from the base of the PR and between f839c29 and c13d0e2.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/init.py

📝 Walkthrough

Walkthrough

This PR migrates to the OTel GenAI semantic conventions: it replaces the legacy system/request-type keys with GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME, consolidates per-index prompt/completion attributes into JSON arrays under gen_ai.input.messages and gen_ai.output.messages, and bumps opentelemetry-semantic-conventions-ai to ^0.5.x.

Changes

Cohort / File(s) Summary
Google GenerativeAI: Core Instrumentation
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
Replaced GEN_AI_SYSTEM/LLM request-type attributes with GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai" and GenAIAttributes.GEN_AI_OPERATION_NAME: "chat" in span creation and duration histograms.
Google GenerativeAI: Event Attributes
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
EVENT_ATTRIBUTES now emits GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai" instead of GEN_AI_SYSTEM: "gemini".
Google GenerativeAI: Span Message Consolidation
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
Introduced gen_ai.input.messages and gen_ai.output.messages as JSON arrays of {role, content}; removed per-index prompt/completion attributes; updated penalty/token semconv keys and histogram provider label to gcp.gen_ai.
Google GenerativeAI: Dependency & Tests
packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml, packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
Bumped opentelemetry-semantic-conventions-ai to >=0.5.0,<0.6.0. Tests adjusted to assert GEN_AI_PROVIDER_NAME == "gcp.gen_ai", GEN_AI_OPERATION_NAME == "chat", and to parse gen_ai.input.messages / gen_ai.output.messages (added import json).
VertexAI: Core Instrumentation
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py
Replaced GEN_AI_SYSTEM/LLM request-type attributes with GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai" and GenAIAttributes.GEN_AI_OPERATION_NAME: "chat" in client span creation for async/sync wrappers.
VertexAI: Event Attributes
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
EVENT_ATTRIBUTES now emits GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai" instead of enum-based GEN_AI_SYSTEM.
VertexAI: Span Message Consolidation
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
Added gen_ai.input.messages and gen_ai.output.messages; aggregate prompt/response into JSON message arrays; mapped penalty and token semconv keys to new GenAI attribute names.
VertexAI: Dependency & Tests
packages/opentelemetry-instrumentation-vertexai/pyproject.toml, packages/opentelemetry-instrumentation-vertexai/tests/...
Bumped opentelemetry-semantic-conventions-ai to >=0.5.0,<0.6.0. Tests updated to parse gen_ai.input.messages / gen_ai.output.messages, use GEN_AI_PROVIDER_NAME == "vertex_ai", and use updated token/penalty attribute keys (added import json).
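The consolidation described in the table above can be sketched as follows. This is a minimal illustration of the attribute shape the PR moves to, not the actual instrumentation code; the `old_attrs` keys and message contents are illustrative.

```python
import json

# Old style: one span attribute per message index (illustrative keys)
old_attrs = {
    "gen_ai.prompt.0.role": "user",
    "gen_ai.prompt.0.content": "What is AI?",
    "gen_ai.completion.0.role": "assistant",
    "gen_ai.completion.0.content": "AI is a field of computer science.",
}

# New style per this PR: single JSON-encoded message arrays
input_messages = [{"role": "user", "content": "What is AI?"}]
output_messages = [{"role": "assistant", "content": "AI is a field of computer science."}]

new_attrs = {
    "gen_ai.provider.name": "gcp.gen_ai",
    "gen_ai.operation.name": "chat",
    "gen_ai.input.messages": json.dumps(input_messages),
    "gen_ai.output.messages": json.dumps(output_messages),
}

# A consumer decodes the arrays instead of scanning indexed keys
decoded = json.loads(new_attrs["gen_ai.input.messages"])
```

Consumers that previously enumerated `gen_ai.prompt.N.*` keys now parse a single JSON value per direction, which is what the updated tests assert.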

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled old keys into new,

Bundled messages into one neat view,
Provider names hopped into their place,
Spans hum JSON, tidy and ace,
A small refactor — a rabbit’s trace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 48.78%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The title accurately describes the main objective: migrating Vertex AI and google-generativeai integrations to OTel GenAI semantic conventions, which matches the extensive refactoring across both packages.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ah/gemini-sem-conv

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

418-433: ⚠️ Potential issue | 🟠 Major

Add required gen_ai.operation.name attribute to token usage histogram.

The gen_ai.client.token.usage metric requires gen_ai.operation.name alongside provider and token type per the OpenTelemetry GenAI specification. Without it, token usage from all operations collapses into the same metric series. Add GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content" to both histogram record calls.

Diff
             attributes={
+                GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content",
                 GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
                 GenAIAttributes.GEN_AI_TOKEN_TYPE: "input",
                 GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
             }
@@
             attributes={
+                GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content",
                 GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
                 GenAIAttributes.GEN_AI_TOKEN_TYPE: "output",
                 GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
             },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 418 - 433, The token usage histogram calls in span_utils.py are
missing the required GenAIAttributes.GEN_AI_OPERATION_NAME attribute; update
both token_histogram.record invocations (the ones using
response.usage_metadata.prompt_token_count and
response.usage_metadata.candidates_token_count) to include
GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content" alongside the existing
provider, token type and GenAIAttributes.GEN_AI_RESPONSE_MODEL (llm_model)
attributes so metrics are separated per operation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py`:
- Around line 206-209: Update the GenAI operation attribute from "chat" to the
semconv name "generate_content" and add that operation attribute to the duration
metric attributes so the histogram includes both gen_ai.operation.name and
gen_ai.system; specifically change the attributes dict where
GenAIAttributes.GEN_AI_OPERATION_NAME is set (currently "chat") to
"generate_content", and when building the gen_ai.client.operation.duration
histogram attributes include
GenAIAttributes.GEN_AI_OPERATION_NAME="generate_content" alongside
GenAIAttributes.GEN_AI_SYSTEM; apply the same fixes for the
AsyncModels.generate_content* and Models.generate_content* wrappers (the other
occurrences referenced near the existing attributes and the sync wrapper
sections).

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py`:
- Line 31: EVENT_ATTRIBUTES currently only sets
GenAIAttributes.GEN_AI_PROVIDER_NAME and the code emits deprecated event names;
update the non-legacy path to emit the new semantic event
"gen_ai.client.inference.operation.details" instead of "gen_ai.{role}.message"
or "gen_ai.choice" and ensure EVENT_ATTRIBUTES (and the event emission call
sites) include the required attributes: GenAIAttributes.GEN_AI_PROVIDER_NAME
(value "gcp.gen_ai") and GenAIAttributes.GEN_AI_OPERATION_NAME (attribute key
"gen_ai.operation.name") plus any other required fields from the GenAI spec;
locate EVENT_ATTRIBUTES and the emit/send calls in event_emitter.py and replace
the deprecated event names with "gen_ai.client.inference.operation.details" and
populate the attribute map with the operation name and provider before emitting.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 205-235: The messages payload currently uses a top-level "content"
field and JSON-encodes parts (producing double-serialization) and uses
{"type":"text","text":...} instead of the GenAI semantic convention; update the
three builders (the async input block that calls _process_content_item and
_process_argument, the sync input block with the same pattern, and the output
builder that uses response.text) to construct each message as {"role": <role>,
"parts": [ ... ]} where each text part is {"type":"text","content": "<plain
string>"} (do not json.dumps the inner parts — only json.dumps the final
messages list when calling _set_span_attribute). Locate and change the code that
currently creates {"role": ..., "content": json.dumps(...)} to instead build a
parts list from processed_content (convert strings to [{"type":"text","content":
...}] and ensure processed_content lists/dicts map to parts with "content"
fields) and pass that structure into _set_span_attribute with
GEN_AI_INPUT_MESSAGES (and make the analogous change for output messages that
used response.text).

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 49-59: The current test only asserts message role and misses
validating the required 'parts' structure; update the assertions after loading
input_messages and output_messages (the variables in this diff) to assert that
each message has a 'parts' field that is a non-empty list and that parts[0]
contains 'type' and 'content' keys (e.g., assert parts exists, len(parts) > 0,
parts[0]["type"] and parts[0]["content"] are present/non-empty) for both
input_messages[0] and output_messages[0] to enforce the OpenTelemetry GenAI
semantic convention.

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`:
- Around line 239-240: Update the GenAI semantic attributes to set
GenAIAttributes.GEN_AI_PROVIDER_NAME to "gcp.vertex_ai" instead of "vertex_ai",
and compute GenAIAttributes.GEN_AI_OPERATION_NAME dynamically in both the async
wrapper and the sync wrapper (where these attributes are currently hard-coded)
by mapping the wrapped API/method name to the correct operation token
("generate_content", "text_completion", or "chat") rather than always using
"chat"; locate the assignment sites that set
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
in the async wrapper and the sync wrapper and replace the hard-coded operation
with a small function or conditional that inspects the wrapped method name
(e.g., method.name or the API call identifier) to choose the correct operation.

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py`:
- Line 32: Update EVENT_ATTRIBUTES so GenAIAttributes.GEN_AI_PROVIDER_NAME is
set to the canonical value "gcp.vertex_ai" (replace the current "vertex_ai");
then remove or stop emitting the deprecated per-message event attributes (the
per-event keys like gen_ai.user.message, gen_ai.assistant.message,
gen_ai.choice) and instead ensure the instrumentation records chat history using
span attributes such as gen_ai.input.messages and gen_ai.output.messages where
the relevant code constructs span attributes for requests/responses (look for
usages in event_emitter.py that emit per-message events and change them to
populate the span attribute arrays accordingly).

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py`:
- Around line 255-258: The current span_utils.py path only checks
kwargs.get("prompt") and misses when callers pass message=..., so modify the
logic in the span-attribute population to also check kwargs.get("message") and
normalize it the same way as the positional args handling (reuse the same
parts/normalization code path used for args) before serializing; specifically
update the block that builds messages and calls _set_span_attribute(span,
GEN_AI_INPUT_MESSAGES, json.dumps(messages)) to accept either prompt or message
(or fall back to the existing args-derived parts) so ChatSession.send_message*
invocations with message=... produce the same parts schema and populate
gen_ai.input.messages consistently.
- Around line 218-227: The span attribute payloads emitted by
_process_vertexai_argument/_set_span_attribute must follow GenAI semconv:
replace {"role":"user","content":...} with
{"role":"user","parts":[{"type":"text","content":...}]} (avoid double-encoding
JSON), ensure assistant outputs include a required "finish_reason" field
alongside {"role":"assistant","parts":[...,"finish_reason":...]} and that any
plain string content is wrapped into a parts array with
{"type":"text","content":...}; also update set_model_input_attributes to check
both kwargs.get("prompt") and kwargs.get("message") (the
ChatSession.send_message() keyword) so message= inputs get recorded into
GEN_AI_INPUT_MESSAGES. Locate and adjust the logic in
_process_vertexai_argument, the blocks that append messages (currently building
{"role","content"}), the output handling that emits plain strings, and the
set_model_input_attributes function to implement these changes.
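The parts-based message shape the inline comments above call for can be sketched as follows. `to_genai_message` is a hypothetical helper name (nothing by that name exists in the PR); it shows the key points the review makes: wrap plain strings into a text part, only JSON-encode the final list (no double-serialization), and include `finish_reason` on output messages.

```python
import json

def to_genai_message(role, content, finish_reason=None):
    """Build a GenAI semconv-style message dict (hypothetical helper).

    Plain strings are wrapped into a single text part; lists are assumed
    to already be part dicts of the form {"type": ..., "content": ...}.
    """
    if isinstance(content, str):
        parts = [{"type": "text", "content": content}]
    else:
        parts = content
    message = {"role": role, "parts": parts}
    if finish_reason is not None:
        # Required on output messages per the GenAI semantic conventions
        message["finish_reason"] = finish_reason
    return message

# Only the outer list is JSON-encoded; inner parts stay as dicts,
# avoiding the double-serialization the review flags.
input_attr = json.dumps([to_genai_message("user", "What is AI?")])
output_attr = json.dumps(
    [to_genai_message("assistant", "AI is a field of computer science.", "stop")]
)
```

A consumer can then `json.loads` each attribute once and read `message["parts"][0]["content"]` directly, with no nested decoding step.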

---

Outside diff comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 418-433: The token usage histogram calls in span_utils.py are
missing the required GenAIAttributes.GEN_AI_OPERATION_NAME attribute; update
both token_histogram.record invocations (the ones using
response.usage_metadata.prompt_token_count and
response.usage_metadata.candidates_token_count) to include
GenAIAttributes.GEN_AI_OPERATION_NAME: "generate_content" alongside the existing
provider, token type and GenAIAttributes.GEN_AI_RESPONSE_MODEL (llm_model)
attributes so metrics are separated per operation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d4c2161f-a9bd-4e62-aa08-e6b3f2ae5221

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2418b and 7c8d572.

⛔ Files ignored due to path filters (2)
  • packages/opentelemetry-instrumentation-google-generativeai/uv.lock is excluded by !**/*.lock
  • packages/opentelemetry-instrumentation-vertexai/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
  • packages/opentelemetry-instrumentation-vertexai/pyproject.toml
  • packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py
  • packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py
  • packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py

Comment on lines +49 to +59
# Input messages are now a single JSON array
assert "gen_ai.input.messages" in attrs
input_messages = json.loads(attrs["gen_ai.input.messages"])
assert len(input_messages) > 0
assert input_messages[0]["role"] == "user"

assert "gen_ai.completion.0.content" in attrs
assert attrs["gen_ai.completion.0.role"] == "assistant"
# Output messages are now a single JSON array
assert "gen_ai.output.messages" in attrs
output_messages = json.loads(attrs["gen_ai.output.messages"])
assert len(output_messages) > 0
assert output_messages[0]["role"] == "assistant"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd --type f --name "*.py" | grep -E "(google.*generative|instrumentation)" | head -20

Repository: traceloop/openllmetry

Length of output: 315


🏁 Script executed:

git ls-files | grep -i "google.*generative" | head -30

Repository: traceloop/openllmetry

Length of output: 2059


🏁 Script executed:

find . -type f -name "test_generate_content.py" 2>/dev/null

Repository: traceloop/openllmetry

Length of output: 156


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py | head -100

Repository: traceloop/openllmetry

Length of output: 4145


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | head -200

Repository: traceloop/openllmetry

Length of output: 8688


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '196,350p'

Repository: traceloop/openllmetry

Length of output: 7531


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '350,450p'

Repository: traceloop/openllmetry

Length of output: 3665


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '372,392p'

Repository: traceloop/openllmetry

Length of output: 1022


🏁 Script executed:

cd packages/opentelemetry-instrumentation-google-generativeai && find . -type f -name "*.py" -exec grep -l "parts\|GEN_AI_INPUT_MESSAGES\|GEN_AI_OUTPUT_MESSAGES" {} \;

Repository: traceloop/openllmetry

Length of output: 199


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_models.py

Repository: traceloop/openllmetry

Length of output: 1227


🏁 Script executed:

web_search "OpenTelemetry GenAI semantic convention message format specification"

Repository: traceloop/openllmetry

Length of output: 115


🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py | sed -n '40,75p'

Repository: traceloop/openllmetry

Length of output: 1726


🌐 Web query:

OpenTelemetry GenAI semantic convention gen_ai.input.messages gen_ai.output.messages message schema

💡 Result:

gen_ai.input.messages (Input messages JSON schema)

gen_ai.input.messages is an array of chat message objects (“InputMessages”). Each item is a ChatMessage with: role + parts (required), and optional name. [1]

[
  {
    "role": "system" | "user" | "assistant" | "tool" | "<provider-specific>",
    "name": "<optional participant name>",
    "parts": [
      // one or more message parts (see below)
    ]
  }
]

Supported parts[] variants (polymorphic by type) include: [1]

  • {"type":"text","content":string}
  • {"type":"tool_call","id":string|null,"name":string,"arguments":any}
  • {"type":"tool_call_response","id":string|null,"response":any}
  • {"type":"server_tool_call","id":string|null,"name":string,"server_tool_call":{...}}
  • {"type":"server_tool_call_response","id":string|null,"server_tool_call_response":{...}}
  • {"type":"blob","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"content":"<base64 when JSON>"}
  • {"type":"file","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"file_id":string}
  • {"type":"uri","mime_type":string|null,"modality":"image"|"video"|"audio"|string,"uri":string}
  • {"type":"reasoning","content":string}
  • {"type":"<any>","...": "..."} // GenericPart (extensible)

gen_ai.output.messages (Output messages JSON schema)

gen_ai.output.messages is an array of output message objects (“OutputMessages”). Each item is an OutputMessage with: role, parts, and required finish_reason; optional name. [2][3]

[
  {
    "role": "assistant" | "tool" | "system" | "user" | "<provider-specific>",
    "name": "<optional participant name>",
    "parts": [
      // same part variants as input.messages
    ],
    "finish_reason": "stop" | "length" | "content_filter" | "tool_call" | "error" | "<provider-specific>"
  }
]

Notes from the semantic conventions text:

  • Input messages must be in the order sent to the model. [2]
  • Each output message corresponds to exactly one model “choice/candidate.” [2]

Sources: Input schema JSON [1], Output schema JSON [3], semconv GenAI events page (links to both schemas + behavioral notes) [2].


🏁 Script executed:

cat packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '205,235p'

Repository: traceloop/openllmetry

Length of output: 1251


🏁 Script executed:

cat packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py | sed -n '126,164p'

Repository: traceloop/openllmetry

Length of output: 1654


🏁 Script executed:

python3 << 'EOF'
import json

# Simulate what the code does
processed_content = [{"type": "text", "text": "What is ai?"}]
message = {
    "role": "user",
    "content": json.dumps(processed_content)  # This is what the code does at line 217, 225, etc.
}

print("Message structure being created:")
print(json.dumps(message, indent=2))
print("\nAfter json.dumps at line 234:")
full_messages = [message]
print(json.dumps(full_messages))
print("\nWhat gets stored in span attribute:")
attr_value = json.dumps(full_messages)
print(attr_value)
print("\nWhat test sees after json.loads:")
test_sees = json.loads(attr_value)
print(json.dumps(test_sees, indent=2))
print(f"\ntest_sees[0] has keys: {test_sees[0].keys()}")
print(f"test_sees[0]['role']: {test_sees[0]['role']}")
print(f"Does test_sees[0] have 'parts'? {'parts' in test_sees[0]}")
print(f"Does test_sees[0] have 'content'? {'content' in test_sees[0]}")

# What spec requires
print("\n\n=== SPEC REQUIRES ===")
spec_compliant = {
    "role": "user",
    "parts": [{"type": "text", "content": "What is ai?"}]
}
print("Spec-compliant structure:")
print(json.dumps(spec_compliant, indent=2))
EOF

Repository: traceloop/openllmetry

Length of output: 864


The test assertions are too weak to catch the message-format violation.

The spec requires parts field (see OpenTelemetry GenAI semantic convention), but the current assertions only check role. Add assertions for parts[0]["type"] and parts[0]["content"] to validate the message structure complies with the semantic convention and catch any regressions.
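The strengthened assertions might look like the sketch below. The `attrs` dict simulates the span attributes a spec-compliant instrumentation would emit; the actual test would read them from the exported span.

```python
import json

# Simulated span attributes with a spec-compliant parts structure
attrs = {
    "gen_ai.input.messages": json.dumps(
        [{"role": "user", "parts": [{"type": "text", "content": "What is AI?"}]}]
    ),
}

input_messages = json.loads(attrs["gen_ai.input.messages"])
assert len(input_messages) > 0
assert input_messages[0]["role"] == "user"

# New assertions: validate the required 'parts' structure, not just 'role'
parts = input_messages[0]["parts"]
assert isinstance(parts, list) and len(parts) > 0
assert parts[0]["type"] == "text"
assert parts[0]["content"]  # non-empty content
```

The same checks would be repeated for `output_messages[0]`, which additionally carries a `finish_reason` field under the convention.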

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 49 - 59, The current test only asserts message role and misses
validating the required 'parts' structure; update the assertions after loading
input_messages and output_messages (the variables in this diff) to assert that
each message has a 'parts' field that is a non-empty list and that parts[0]
contains 'type' and 'content' keys (e.g., assert parts exists, len(parts) > 0,
parts[0]["type"] and parts[0]["content"] are present/non-empty) for both
input_messages[0] and output_messages[0] to enforce the OpenTelemetry GenAI
semantic convention.

Comment on lines +239 to +240
GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

OpenTelemetry gen_ai.operation.name predefined values and well-known provider.name value for Vertex AI

💡 Result:

gen_ai.operation.name predefined (well-known) values

  • chat
  • create_agent
  • embeddings
  • execute_tool
  • generate_content
  • invoke_agent
  • retrieval
  • text_completion (opentelemetry.io)

gen_ai.provider.name well-known value for Vertex AI

  • gcp.vertex_ai (used when accessing aiplatform.googleapis.com) (opentelemetry.io)

Citations:


🏁 Script executed:

# Find the Vertex AI instrumentation file and check the wrapped methods
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | head -100

Repository: traceloop/openllmetry

Length of output: 4037


🏁 Script executed:

# Check lines around 239-240 and 288-289
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | sed -n '235,245p'

Repository: traceloop/openllmetry

Length of output: 448


🏁 Script executed:

# Check lines around 288-289
cat -n packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py | sed -n '284,294p'

Repository: traceloop/openllmetry

Length of output: 511


🏁 Script executed:

# Search for WRAPPED_METHODS definition
rg "WRAPPED_METHODS" packages/opentelemetry-instrumentation-vertexai/ -A 10 -B 2

Repository: traceloop/openllmetry

Length of output: 4728


Use correct gen_ai.provider.name and derive gen_ai.operation.name from the wrapped API instead of hard-coding chat.

The GenAI semantic conventions require specific operation values—generate_content, text_completion, and chat—based on the API being called. Hard-coding "chat" for all methods mislabels generate_content* and predict* operations and collapses distinct operation types into one bucket. Additionally, the provider should be gcp.vertex_ai, not vertex_ai, per the OpenTelemetry registry.

Applies to lines 239-240 (async wrapper) and 288-289 (sync wrapper).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`
around lines 239 - 240, Update the GenAI semantic attributes to set
GenAIAttributes.GEN_AI_PROVIDER_NAME to "gcp.vertex_ai" instead of "vertex_ai",
and compute GenAIAttributes.GEN_AI_OPERATION_NAME dynamically in both the async
wrapper and the sync wrapper (where these attributes are currently hard-coded)
by mapping the wrapped API/method name to the correct operation token
("generate_content", "text_completion", or "chat") rather than always using
"chat"; locate the assignment sites that set
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
in the async wrapper and the sync wrapper and replace the hard-coded operation
with a small function or conditional that inspects the wrapped method name
(e.g., method.name or the API call identifier) to choose the correct operation.

coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py (1)

239-240: ⚠️ Potential issue | 🟠 Major

Use semconv-defined provider and operation mapping instead of hard-coded values.

GEN_AI_PROVIDER_NAME should use the Vertex AI well-known token, and GEN_AI_OPERATION_NAME should be derived from the wrapped method (generate_content* / predict* / send_message*) instead of always "chat". Current values collapse distinct operation types and skew telemetry dimensions. This also affects updated assertions in packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py and packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py.

Proposed fix
+def _resolve_genai_operation_name(to_wrap):
+    method = (to_wrap or {}).get("method", "")
+    if method.startswith("generate_content"):
+        return "generate_content"
+    if method.startswith("predict"):
+        return "text_completion"
+    if method.startswith("send_message"):
+        return "chat"
+    return "chat"
+
@@
         attributes={
-            GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
-            GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
+            GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.vertex_ai",
+            GenAIAttributes.GEN_AI_OPERATION_NAME: _resolve_genai_operation_name(
+                to_wrap
+            ),
         },
@@
         attributes={
-            GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
-            GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
+            GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.vertex_ai",
+            GenAIAttributes.GEN_AI_OPERATION_NAME: _resolve_genai_operation_name(
+                to_wrap
+            ),
         },

Based on learnings: Follow the OpenTelemetry GenAI semantic specification at https://opentelemetry.io/docs/specs/semconv/gen-ai/

Also applies to: 288-289
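Outside the diff context, the proposed resolver can be exercised standalone. A minimal sketch, assuming the `to_wrap` dict with a `"method"` key follows the instrumentation's wrapped-method config convention (the operation tokens mirror the semconv well-known values used in the diff above):

```python
def resolve_genai_operation_name(to_wrap):
    """Map a wrapped Vertex AI method name to a gen_ai.operation.name token."""
    method = (to_wrap or {}).get("method", "")
    if method.startswith("generate_content"):
        return "generate_content"
    if method.startswith("predict"):
        return "text_completion"
    if method.startswith("send_message"):
        return "chat"
    # Fallback for unrecognized methods, matching the diff's default.
    return "chat"

print(resolve_genai_operation_name({"method": "generate_content_async"}))  # generate_content
print(resolve_genai_operation_name({"method": "predict_streaming"}))       # text_completion
print(resolve_genai_operation_name(None))                                  # chat
```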

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`
around lines 239 - 240, replace the hard-coded
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
values with the semconv-defined provider token for Vertex AI and a derived
operation mapping based on the wrapped method name: map methods starting with
"generate_content" -> operation "generate_content", "predict" ->
"text_completion", and "send_message" -> "chat" (use the exact OpenTelemetry GenAI semantic-convention
tokens for provider and operations). Implement this logic where
GenAIAttributes.GEN_AI_PROVIDER_NAME and GenAIAttributes.GEN_AI_OPERATION_NAME
are set (the block that currently sets "vertex_ai" and "chat") so provider uses
the canonical semconv constant and operation is chosen by checking the wrapped
method name (e.g., wrapped_method_name.startswith("generate_content"),
.startswith("predict"), .startswith("send_message")); update the related
assertions in the disabled_test_bison.py and disabled_test_gemini.py fixtures to
expect the semconv provider token and the mapped operation values. Ensure all
references to the old hard-coded strings are replaced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 209-210: The branch that handles a plain string (currently
appending {"role": "user", "content": contents}) is inconsistent with other
branches that JSON-encode an array of parts; update that branch so it
JSON-encodes the string into an array-of-parts before appending (e.g., use
json.dumps([...]) to produce the same array-of-parts shape used in the other
branches), ensuring the messages list always has content as a JSON-encoded
array; locate this change where the variable contents is handled and messages is
appended in span_utils.py and mirror the encoding logic used in the other
branches.
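The fix the inline comment asks for can be sketched in isolation. This is a standalone sketch, not the actual `span_utils.py` code: the `encode_input_messages` helper name and the `{"type": "text", "text": ...}` part shape are assumptions mirroring the array-of-parts layout the other branches are said to emit.

```python
import json

def encode_input_messages(contents):
    """Normalize a plain-string prompt into the gen_ai.input.messages shape.

    Sketch only: the bare string is JSON-encoded as an array of parts, so the
    message's content has the same shape as the structured-content branches.
    """
    messages = []
    if isinstance(contents, str):
        messages.append({
            "role": "user",
            # Array-of-parts, JSON-encoded, instead of the raw string.
            "content": json.dumps([{"type": "text", "text": contents}]),
        })
    return json.dumps(messages)

print(encode_input_messages("Hello Gemini"))
```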

---

Duplicate comments:
In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py`:
- Around line 239-240: same fix as the inline vertexai prompt above: use the
semconv provider token for Vertex AI, derive the operation name from the
wrapped method, and update the assertions in disabled_test_bison.py and
disabled_test_gemini.py accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 45b5433d-73a3-4460-bc3f-35d3bba6e3af

📥 Commits

Reviewing files that changed from the base of the PR and between 7c8d572 and f839c29.

⛔ Files ignored due to path filters (2)
  • packages/opentelemetry-instrumentation-google-generativeai/uv.lock is excluded by !**/*.lock
  • packages/opentelemetry-instrumentation-vertexai/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
  • packages/opentelemetry-instrumentation-vertexai/pyproject.toml
  • packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_bison.py
  • packages/opentelemetry-instrumentation-vertexai/tests/disabled_test_gemini.py
  • packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
✅ Files skipped from review due to trivial changes (4)
  • packages/opentelemetry-instrumentation-vertexai/pyproject.toml
  • packages/opentelemetry-instrumentation-google-generativeai/pyproject.toml
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/event_emitter.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/event_emitter.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/init.py
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py

GenAIAttributes.GEN_AI_SYSTEM: "Google",
SpanAttributes.LLM_REQUEST_TYPE: LLMRequestTypeValues.COMPLETION.value,
GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
Member

use GenAiOperationNameValues.TEXT_COMPLETION.value instead of the hardcoded value. And why change to chat from completion? A mistake from the beginning?
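For reference, the semconv enums are plain string-valued members, so `.value` yields the wire token. A minimal mirror of how `GenAiOperationNameValues` is expected to behave (the real enum ships in opentelemetry-semantic-conventions' incubating gen_ai attributes module; the member list here is a subset assumed from the spec's well-known operation tokens):

```python
from enum import Enum

class GenAiOperationNameValues(Enum):
    """Mirror of the semconv enum; values taken from the spec's well-known tokens."""
    CHAT = "chat"
    TEXT_COMPLETION = "text_completion"
    GENERATE_CONTENT = "generate_content"
    EMBEDDINGS = "embeddings"

# Prefer the enum member over a hardcoded "text_completion" string literal.
print(GenAiOperationNameValues.TEXT_COMPLETION.value)  # text_completion
```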

attributes={
GenAIAttributes.GEN_AI_SYSTEM: "Google",
SpanAttributes.LLM_REQUEST_TYPE: LLMRequestTypeValues.COMPLETION.value,
GenAIAttributes.GEN_AI_PROVIDER_NAME: "gcp.gen_ai",
Member

can't you import from GenAiSystemValues.<gcp>.value?
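Same idea for the provider token: the spec defines well-known `gen_ai.provider.name` values for Google services, so the string need not be hardcoded. A mirror of the expected enum (member names assumed; the values are the spec's well-known tokens):

```python
from enum import Enum

class GenAiProviderNameValues(Enum):
    """Subset mirror of the spec's well-known gen_ai.provider.name tokens."""
    GCP_GEN_AI = "gcp.gen_ai"
    GCP_VERTEX_AI = "gcp.vertex_ai"
    GCP_GEMINI = "gcp.gemini"

print(GenAiProviderNameValues.GCP_GEN_AI.value)  # gcp.gen_ai
```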

Comment on lines +22 to +23
GEN_AI_INPUT_MESSAGES = "gen_ai.input.messages"
GEN_AI_OUTPUT_MESSAGES = "gen_ai.output.messages"
Member

import from
GenAIAttributes.GEN_AI_INPUT_MESSAGES

Member

do you know why the revision changed and the upload time was added in many places?

Comment on lines +239 to +240
GenAIAttributes.GEN_AI_PROVIDER_NAME: "vertex_ai",
GenAIAttributes.GEN_AI_OPERATION_NAME: "chat",
Member

same comments from google genai
