
feat(langchain) GenAI semconv compliance #3836

Closed
max-deygin-traceloop wants to merge 7 commits into main from
max/tlp-1927-langchain-instrumentation

Conversation

Contributor

@max-deygin-traceloop max-deygin-traceloop commented Mar 22, 2026

Migrate LangChain instrumentation to OTel GenAI semconv

Aligns the LangChain instrumentation with the upstream OpenTelemetry GenAI semantic conventions.

Changes:

  • Replace all SpanAttributes.LLM_* constants with upstream GenAIAttributes.GEN_AI_* equivalents
  • Migrate flat gen_ai.prompt.{i}.* / gen_ai.completion.{i}.* attributes to gen_ai.input.messages / gen_ai.output.messages JSON arrays (set_llm_request, set_chat_request, set_chat_response)
  • Migrate flat gen_ai.tool.definitions.{i}.* attributes to gen_ai.tool.definitions JSON array
  • Replace _set_chat_tool_calls (flat span attrs) with _build_tool_calls_list (returns list for embedding in JSON)
  • Fix double-set gen_ai.operation.name bug in callback_handler.py
  • Fix vendor detection casing: "Anthropic" → "anthropic" to match OTel semconv
  • Update all affected tests to parse gen_ai.input.messages / gen_ai.output.messages JSON arrays

Note: This is a recreation of #3822, which was auto-closed when its target branch was merged into main. No code changes were made.
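The flat-to-consolidated migration described above can be sketched roughly like this (a hypothetical helper for illustration, not the PR's actual code; attribute keys follow the upstream GenAI semconv):

```python
import json

# Hypothetical sketch contrasting the old per-index span attributes with the
# consolidated JSON-array attribute this PR migrates to.
def consolidate_messages(messages: list[dict]) -> dict:
    # Old style: one span attribute per message field,
    # e.g. gen_ai.prompt.0.role, gen_ai.prompt.0.content, ...
    flat = {}
    for i, msg in enumerate(messages):
        flat[f"gen_ai.prompt.{i}.role"] = msg["role"]
        flat[f"gen_ai.prompt.{i}.content"] = msg["content"]

    # New style: a single JSON-serialized array under gen_ai.input.messages
    consolidated = {"gen_ai.input.messages": json.dumps(messages)}
    return {"flat": flat, "consolidated": consolidated}

attrs = consolidate_messages([{"role": "user", "content": "What's the weather?"}])
```

Consumers then parse the one JSON attribute instead of iterating indexed keys, which is also how the updated tests read the data back.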

Summary by CodeRabbit

  • Refactor

    • Consolidated telemetry into structured JSON attributes for tools, input messages, and output messages; replaced legacy per-index attributes with unified fields.
    • Switched span usage and operation naming to GenAI semantic attributes; moved provider attribution to a dedicated provider-name attribute and updated metric labels.
  • Chores

    • Bumped semantic-conventions dependency range to support GenAI attribute changes.
  • Tests

    • Updated tests to validate JSON-structured attributes, provider-name attribution, and normalized vendor identifiers.

@coderabbitai

coderabbitai bot commented Mar 22, 2026

📝 Walkthrough

Walkthrough

Consolidates per-index span attributes into JSON-serialized consolidated attributes for inputs, outputs, and tool definitions; switches usage and operation attribute keys to GenAI semantic conventions; normalizes vendor identifiers to lowercase/structured values; updates provider attribute usage and bumps semconv dependency.

Changes

Cohort / File(s) Summary
Core instrumentation
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py, packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py, packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py
Replaced indexed/per-field span attributes with consolidated JSON attributes: GEN_AI_INPUT_MESSAGES, GEN_AI_OUTPUT_MESSAGES, GEN_AI_TOOL_DEFINITIONS. Switched request/operation naming to GEN_AI_OPERATION_NAME and usage keys to GEN_AI_USAGE_*. Normalized vendor detection return values (title-case → lowercase/structured) and changed the default from Langchain → langchain.
Event & utility attributes
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py, packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/utils.py
Replaced emitted/log attribute key for provider from GEN_AI_SYSTEM to GEN_AI_PROVIDER_NAME with value "langchain".
Dependency
packages/opentelemetry-instrumentation-langchain/pyproject.toml
Bumped opentelemetry-semantic-conventions-ai constraint from >=0.4.16,<0.5.0 to >=0.5.0,<0.6.0.
Tests — assertions & parsing updated
packages/opentelemetry-instrumentation-langchain/tests/* (e.g., test_chains.py, test_generation_role_extraction.py, test_langgraph.py, test_llms.py, test_structured_output.py, test_tool_call_content.py, test_tool_calls.py, tests/metrics/test_langchain_metrics.py, test_agents.py, test_documents_chains.py, test_lcel.py)
Updated tests to parse and assert consolidated JSON attributes (GEN_AI_INPUT_MESSAGES, GEN_AI_OUTPUT_MESSAGES, GEN_AI_TOOL_DEFINITIONS) instead of per-index attributes; updated provider assertions to GEN_AI_PROVIDER_NAME (value "langchain" or normalized provider names); updated usage attribute checks to GEN_AI_USAGE_*.
Tests — new files
packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py, packages/opentelemetry-instrumentation-langchain/tests/test_semconv_compliance.py
Added vendor-detection parametrized tests validating normalized identifiers and a semconv compliance test re-exporting upstream semconv_ai test cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I bundled prompts and answers tight,
Into JSON burrows, neat and light,
Vendors lowercased, attributes aligned,
Spans now hop with meaning defined,
A tiny rabbit stamps — all set, just right!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 17.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(langchain) GenAI semconv compliance' clearly and concisely summarizes the main change: migrating the LangChain instrumentation to align with OpenTelemetry GenAI semantic conventions.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (1)

108-112: ⚠️ Potential issue | 🟡 Minor

Docstring is inconsistent with implementation.

The docstring on line 109 states defaults to "Langchain" but the actual return value on lines 112 and 120 is "langchain" (lowercase). Update the docstring to match the implementation.

     Returns:
-        Vendor string, defaults to "Langchain" if no match found
+        Vendor string, defaults to "langchain" if no match found
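The finding can be illustrated with a minimal sketch of the behavior under review (a hypothetical reduction of the function, not the file's actual code): the fallback returns the lowercase identifier, so the docstring must say "langchain".

```python
# Hypothetical sketch: the default branch returns lowercase "langchain",
# which is what the docstring needs to document.
def detect_vendor_from_class(class_name):
    """Detect the GenAI provider from a LangChain class name.

    Returns:
        Vendor string, defaults to "langchain" if no match found.
    """
    if not class_name:
        return "langchain"
    lowered = class_name.lower()
    if "anthropic" in lowered:
        return "anthropic"
    if "openai" in lowered:
        return "openai"
    return "langchain"
```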
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py`
around lines 108 - 112, Update the docstring for the function that returns the
vendor string (the code checking `if not class_name: return "langchain"`) so the
default vendor shown in the docstring matches the implementation: change
"Langchain" to "langchain" (lowercase) and ensure any other descriptive text
refers to the lowercase value returned when `class_name` is falsy.
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (1)

356-366: ⚠️ Potential issue | 🟡 Minor

Handle string arguments in OpenAI-format tool calls to prevent double-encoding.

When _build_tool_calls_list receives tool calls from additional_kwargs (the fallback path), the arguments field comes as a JSON string from OpenAI API responses. Calling json.dumps() on a string double-encodes it, producing an escaped string like "{\"location\": \"San Francisco\"}" instead of a plain JSON object.

The fallback path at line 156-162 and 221-228 reaches additional_kwargs data when messages lack a native tool_calls attribute. In this case, tool_args is already a JSON string and should not be serialized again.

🔧 Proposed fix
         if tool_args is not None:
-            call_obj["arguments"] = json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
+            if isinstance(tool_args, str):
+                call_obj["arguments"] = tool_args
+            else:
+                call_obj["arguments"] = json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
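The pitfall is easy to reproduce in isolation. Below is a self-contained demonstration plus a simplified version of the proposed guard (without the project's CallbackFilteredJSONEncoder, which is omitted here for brevity):

```python
import json

# json.dumps on an already-serialized JSON string wraps it in quotes and
# escapes the inner quotes, yielding a JSON string literal, not an object.
already_json = '{"location": "San Francisco"}'
double_encoded = json.dumps(already_json)
# double_encoded == '"{\\"location\\": \\"San Francisco\\"}"'

# Simplified form of the proposed fix: pass strings through untouched,
# serialize everything else.
def encode_arguments(tool_args):
    return tool_args if isinstance(tool_args, str) else json.dumps(tool_args)
```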
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 356 - 366, The code in _build_tool_calls_list currently always
json.dumps the tool_args into call_obj["arguments"], which double-encodes when
tool_args is already a JSON string (from additional_kwargs/OpenAI responses);
update the logic so that if tool_args is an instance of str you assign it
directly to call_obj["arguments"], otherwise serialize with json.dumps(...,
cls=CallbackFilteredJSONEncoder). Ensure you reference the existing tool_args
variable and the call_obj["arguments"] assignment so the change is localized and
preserves the encoder for non-string argument values.
🧹 Nitpick comments (13)
packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py (11)

817-817: Remove debug print statement.

Debug print statements should be removed from test code before merging.

🧹 Proposed fix
     result = model_with_tools.invoke(messages)
-    print(result.model_dump())
     spans = span_exporter.get_finished_spans()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py` at
line 817, Remove the leftover debug print by deleting the call to
print(result.model_dump()) in the test; locate the occurrence that prints the
'result' object's model_dump (search for "print(result.model_dump())" or
references to result.model_dump within tests/test_tool_calls.py) and remove that
statement so tests do not emit debug output.

637-639: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_anthropic_text_block_and_history(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 637 - 639, The test in
packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py has
duplicate `@pytest.mark.skip` decorators on the same test (the two identical
`@pytest.mark.skip`(reason="Direct model invocations do not create langchain
spans") lines); remove the redundant decorator so only a single
`@pytest.mark.skip` remains and keep the `@pytest.mark.vcr` decorator intact; locate
the duplicate decorators above the test function and delete the extra one.

1100-1102: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_parallel_tool_calls_with_events_with_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 1100 - 1102, The test has two identical decorators
"@pytest.mark.skip(reason="Direct model invocations do not create langchain
spans")" applied before the test (alongside "@pytest.mark.vcr"); remove the
duplicate skip decorator so only a single "@pytest.mark.skip(...)" remains and
leave "@pytest.mark.vcr" intact to avoid redundant annotations on the test
definition.

266-268: Remove duplicate @pytest.mark.skip decorators.

Multiple identical skip decorators on the same test are redundant.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_with_history_with_events_with_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 266 - 268, The test currently has two identical decorators
"@pytest.mark.skip(reason=\"Direct model invocations do not create langchain
spans\")" applied before the test (see the decorated test in
tests/test_tool_calls.py); remove the duplicate so only one skip decorator
remains (keep a single "@pytest.mark.skip(...)" and leave the other decorators
such as "@pytest.mark.vcr" untouched) to eliminate redundant decorators on that
test.

1163-1165: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_parallel_tool_calls_with_events_with_no_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 1163 - 1165, The test has two identical decorators
"@pytest.mark.skip(reason=\"Direct model invocations do not create langchain
spans\")" applied back-to-back; remove the duplicate so only one skip decorator
remains above the test (locate the duplicate decorators in
packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py around
the test marked with `@pytest.mark.vcr`) to leave a single
"@pytest.mark.skip(...)" immediately above the test.

968-968: Remove debug print statement.

🧹 Proposed fix
     result = model_with_tools.invoke(messages)
-    print(result.model_dump())
     spans = span_exporter.get_finished_spans()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py` at
line 968, Remove the stray debug print by deleting the line that calls
print(result.model_dump()) in the test (remove the debug output for the variable
result and its model_dump() call); if the test needs to assert state instead of
printing, replace it with an appropriate assertion on result (e.g., assert
something about result.model_dump()) or use logger.debug instead of print.

768-771: Remove duplicate @pytest.mark.skip decorators.

Three identical skip decorators here.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_anthropic_text_block_and_history_with_events_with_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 768 - 771, Remove the duplicate pytest.skip decorators on the test
in packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py:
there are three identical `@pytest.mark.skip`(...) lines; keep a single
`@pytest.mark.skip`(reason="Direct model invocations do not create langchain
spans") and delete the other two so only one skip decorator decorates the test
function.

526-528: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_anthropic_text_block_with_events_with_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 526 - 528, The test has duplicate decorators
"@pytest.mark.skip(reason=\"Direct model invocations do not create langchain
spans\")" applied twice; remove the redundant decorator so the test has a single
`@pytest.mark.skip` (and retain the existing `@pytest.mark.vcr`) to avoid duplicate
markers; update the test declaration that contains these decorators accordingly.

919-922: Remove duplicate @pytest.mark.skip decorators.

Three identical skip decorators here.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_anthropic_text_block_and_history_with_events_with_no_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 919 - 922, Remove the duplicate pytest.skip decorators: there are
three identical `@pytest.mark.skip`(reason="Direct model invocations do not create
langchain spans") stacked before the test; keep a single `@pytest.mark.skip`(...)
and retain the existing `@pytest.mark.vcr` decorator. Update the decorators above
the test function so only one skip decorator remains followed by
`@pytest.mark.vcr`.

586-588: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_anthropic_text_block_with_events_with_no_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 586 - 588, Remove the duplicated decorator line
"@pytest.mark.skip(reason=\"Direct model invocations do not create langchain
spans\")" so the test decorated with the skip marker only has one
`@pytest.mark.skip` and the `@pytest.mark.vcr` decorator remains; locate the
duplicate decorator lines in the test function in tests/test_tool_calls.py and
delete the redundant one to avoid repeated markers.

365-367: Remove duplicate @pytest.mark.skip decorators.

🧹 Proposed fix
 @pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
-@pytest.mark.skip(reason="Direct model invocations do not create langchain spans")
 @pytest.mark.vcr
 def test_tool_calls_with_history_with_events_with_no_content(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`
around lines 365 - 367, Remove the duplicate decorator instance of
`@pytest.mark.skip` on the test that currently has two identical `@pytest.mark.skip`
lines above the `@pytest.mark.vcr` decorator; leave a single `@pytest.mark.skip`
(and keep `@pytest.mark.vcr`) so the test is skipped only once and the decorator
ordering remains correct.
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (2)

215-218: Potential inconsistency: function_call tool format differs from tool_calls format.

The legacy function_call handling creates a tool_call object with name and arguments but no id, while _build_tool_calls_list produces objects with id, name, and arguments. This inconsistency may cause downstream consumers to handle the two formats differently.

Consider aligning the structure, e.g., by adding a placeholder or empty id field:

🔧 Proposed alignment
                 if generation.message.additional_kwargs.get("function_call"):
                     fc = generation.message.additional_kwargs.get("function_call")
                     msg_obj["role"] = "assistant"
-                    msg_obj["tool_calls"] = [{"name": fc.get("name"), "arguments": fc.get("arguments")}]
+                    msg_obj["tool_calls"] = [{"id": "", "name": fc.get("name"), "arguments": fc.get("arguments")}]
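The suggested alignment can be sketched as follows (the helper name is hypothetical; only the output shape matters):

```python
# Hypothetical helper illustrating the suggested shape: legacy function_call
# dicts gain an "id" key so they match the {id, name, arguments} objects
# that _build_tool_calls_list is described as producing.
def normalize_function_call(fc: dict) -> dict:
    return {
        "id": fc.get("id", ""),  # legacy function_call payloads carry no id
        "name": fc.get("name"),
        "arguments": fc.get("arguments"),
    }

call = normalize_function_call({"name": "get_weather", "arguments": '{"city": "SF"}'})
```

With a uniform shape, downstream consumers can parse both the native tool_calls path and the legacy function_call path with one code branch.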
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 215 - 218, The legacy function_call handling branch (when
generation.message.additional_kwargs.get("function_call") is truthy) builds a
tool call as {"name", "arguments"} which is inconsistent with
_build_tool_calls_list's objects that include "id", "name", and "arguments";
update the branch that sets msg_obj["tool_calls"] so it includes an "id" key
(e.g., empty string or None) alongside "name" and "arguments" when constructing
fc from generation.message.additional_kwargs.get("function_call") to match the
shape produced by _build_tool_calls_list and avoid downstream format mismatches.

277-279: Silent exception swallowing may hide issues.

Catching all exceptions and passing silently could mask bugs in usage metadata processing. Consider logging at debug level for observability.

🔧 Proposed improvement
+    import logging
+    logger = logging.getLogger(__name__)
+
     except Exception:
         # If there's any issue processing usage metadata, continue without it
-        pass
+        logger.debug("Failed to process usage metadata", exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 277 - 279, The except Exception: pass in span_utils.py silently
swallows errors during usage metadata processing; replace the silent pass with a
debug-level log that records the exception (use logging.getLogger(__name__) or
the module's logger) and include exception details (exc_info=True or
logger.debug(..., exc_info=err)) so failures in the usage metadata processing
are observable while still allowing execution to continue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/opentelemetry-instrumentation-langchain/tests/test_langgraph.py`:
- Line 79: The test asserts the wrong attribute key: update the assertion that
currently checks
openai_span.attributes[SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS] to instead use
the OpenAI instrumentation's key SpanAttributes.LLM_USAGE_TOTAL_TOKENS (i.e.,
assert openai_span.attributes[SpanAttributes.LLM_USAGE_TOTAL_TOKENS] == 35),
keeping the rest of the test and the openai_span variable unchanged;
alternatively, if you intend to standardize on GEN_AI_USAGE_TOTAL_TOKENS, change
the OpenAI instrumentation export to set
SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, but the quick fix is to update the
test to reference SpanAttributes.LLM_USAGE_TOTAL_TOKENS.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1b19291b-7a01-44d0-bf3d-88614c86335b

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2418b and 71a5029.

📒 Files selected for processing (13)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py
  • packages/opentelemetry-instrumentation-langchain/pyproject.toml
  • packages/opentelemetry-instrumentation-langchain/tests/test_chains.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_generation_role_extraction.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_langgraph.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_llms.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_semconv_compliance.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_structured_output.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_call_content.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py

Member
should have changes also in the uv.lock , no?

Contributor Author

Yes, just figured it out: the langchain instrumentation tests depend on bedrock, which depends on semconv <0.5.0, which conflicts with langchain's dependency on semconv >=0.5.0...

Need to publish bedrock and openai versions supporting 0.5.0 first

max-deygin-traceloop and others added 7 commits March 24, 2026 16:53
…→ GEN_AI_

- Replace SpanAttributes.LLM_REQUEST_FUNCTIONS with GenAIAttributes.GEN_AI_TOOL_DEFINITIONS
- Replace SpanAttributes.LLM_REQUEST_TYPE with GenAIAttributes.GEN_AI_OPERATION_NAME
- Replace SpanAttributes.LLM_USAGE_TOTAL_TOKENS with SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS
- Update test_llms.py, test_tool_calls.py, test_langgraph.py accordingly
- Add test_semconv_compliance.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nitions to JSON array

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace flat gen_ai.prompt.{i}.* attributes with GEN_AI_INPUT_MESSAGES JSON array
- Replace flat gen_ai.completion.{i}.* attributes with GEN_AI_OUTPUT_MESSAGES JSON array
- Convert _set_chat_tool_calls (flat span attrs) to _build_tool_calls_list (returns list)
- Fix vendor_detection: "Anthropic" → "anthropic" to match OTel semconv lowercase
- Update all affected tests to parse GEN_AI_INPUT/OUTPUT_MESSAGES JSON arrays

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…emove debug print

- vendor_detection.py: use canonical GenAISystem string values for all vendors
  (AWS → aws.bedrock, HuggingFace → hugging_face, Google → google, Azure → az.ai.openai,
   Cohere → cohere, Ollama → ollama, Together → together_ai, Replicate → replicate,
   Fireworks → fireworks, Groq → groq, MistralAI → mistral_ai)
- span_utils.py: remove debug print() from production exception handler
- tests: update GEN_AI_SYSTEM assertions to match new lowercase values

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ck, add vendor unit tests

- vendor_detection.py: Google/Vertex AI → "gcp.gen_ai" (matches GenAISystem.GOOGLE)
- vendor_detection.py: both fallback returns "Langchain" → "langchain" (matches GenAISystem.LANGCHAIN)
- callback_handler.py: two GEN_AI_SYSTEM default fallbacks "Langchain" → "langchain"
- span_utils.py: GEN_AI_SYSTEM default fallback "Langchain" → "langchain"
- tests: add parameterized test_vendor_detection.py covering all 13 vendors + default fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
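
The flat-to-JSON migration in the commits above means tests read one JSON attribute per direction instead of iterating indexed keys. A minimal sketch, assuming a message shape of role plus content parts (the exact structure is defined by the OTel GenAI semconv; the `attributes` dict here is a stand-in for span attributes captured by an in-memory exporter):

```python
import json

# Stand-in for captured span attributes after the migration: one JSON array
# per direction replaces the flat gen_ai.prompt.{i}.* / gen_ai.completion.{i}.* keys.
attributes = {
    "gen_ai.input.messages": json.dumps([
        {"role": "user", "parts": [{"type": "text", "content": "What is 2+2?"}]},
    ]),
    "gen_ai.output.messages": json.dumps([
        {"role": "assistant", "parts": [{"type": "text", "content": "4"}]},
    ]),
}

# Tests now parse the JSON arrays and assert on roles and content.
input_messages = json.loads(attributes["gen_ai.input.messages"])
output_messages = json.loads(attributes["gen_ai.output.messages"])
assert input_messages[0]["role"] == "user"
assert output_messages[0]["role"] == "assistant"
```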
@max-deygin-traceloop max-deygin-traceloop force-pushed the max/tlp-1927-langchain-instrumentation branch from c5d49f5 to 15bf628 Compare March 24, 2026 14:57

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py`:
- Around line 109-112: Docstring for the vendor detection fallback is
inconsistent with the actual return value: the docstring says the default is
"Langchain" but the function in vendor_detection.py returns "langchain". Update
either the docstring or the return value in the function (the code block that
checks `if not class_name: return "langchain"`); prefer making the docstring
match the code by changing the docstring to state the default is "langchain"
(lowercase), or if you need a capitalized brand, change the return to
`"Langchain"` so the docstring and the function (`vendor_detection` fallback)
are consistent.

In
`@packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py`:
- Around line 56-59: The type annotation for detect_vendor_from_class is
incorrect: update its signature from class_name: str to class_name:
Optional[str] and import Optional from typing so the annotation matches the
implementation that handles None (see detect_vendor_from_class and its use of if
not class_name:), ensuring tests that pass None remain valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06c459af-a68c-4bfe-9a8a-d99ae96f40ac

📥 Commits

Reviewing files that changed from the base of the PR and between c5d49f5 and 15bf628.

📒 Files selected for processing (19)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/utils.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py
  • packages/opentelemetry-instrumentation-langchain/pyproject.toml
  • packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_agents.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_chains.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_documents_chains.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_generation_role_extraction.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_langgraph.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_llms.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_semconv_compliance.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_structured_output.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_call_content.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py
✅ Files skipped from review due to trivial changes (3)
  • packages/opentelemetry-instrumentation-langchain/pyproject.toml
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/utils.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_semconv_compliance.py
🚧 Files skipped from review as they are similar to previous changes (13)
  • packages/opentelemetry-instrumentation-langchain/tests/test_agents.py
  • packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_documents_chains.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_structured_output.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_langgraph.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_generation_role_extraction.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_tool_call_content.py
  • packages/opentelemetry-instrumentation-langchain/tests/test_chains.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py

Comment on lines 109 to +112

```diff
     Vendor string, defaults to "Langchain" if no match found
     """
     if not class_name:
-        return "Langchain"
+        return "langchain"
```

⚠️ Potential issue | 🟡 Minor

Docstring is inconsistent with the actual return value.

The docstring states defaults to "Langchain" but the actual return value is "langchain" (lowercase).

📝 Suggested fix

```diff
     Returns:
-        Vendor string, defaults to "Langchain" if no match found
+        Vendor string, defaults to "langchain" if no match found
     """
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-        Vendor string, defaults to "Langchain" if no match found
+        Vendor string, defaults to "langchain" if no match found
         """
         if not class_name:
             return "langchain"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py`
around lines 109 - 112, Docstring for the vendor detection fallback is
inconsistent with the actual return value: the docstring says the default is
"Langchain" but the function in vendor_detection.py returns "langchain". Update
either the docstring or the return value in the function (the code block that
checks `if not class_name: return "langchain"`); prefer making the docstring
match the code by changing the docstring to state the default is "langchain"
(lowercase), or if you need a capitalized brand, change the return to
`"Langchain"` so the docstring and the function (`vendor_detection` fallback)
are consistent.

Comment on lines +56 to +59
(None, "langchain"),
])
def test_detect_vendor_from_class(class_name, expected):
assert detect_vendor_from_class(class_name) == expected

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check the function signature in vendor_detection.py
rg -n "def detect_vendor_from_class" packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py
```

Repository: traceloop/openllmetry

Length of output: 122


🏁 Script executed:

```shell
# View the function implementation
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py | head -120 | tail -25
```

Repository: traceloop/openllmetry

Length of output: 837


🏁 Script executed:

```shell
# View the complete test file to see all parametrization cases
cat -n packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py
```

Repository: traceloop/openllmetry

Length of output: 2518


Type annotation does not match actual behavior: None is accepted but signature says str.

The function signature specifies class_name: str, but the implementation explicitly handles None via if not class_name: (line 111) and the test correctly validates this behavior with (None, "langchain"). Update the function signature to Optional[str] to align the type annotation with the actual implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/tests/test_vendor_detection.py`
around lines 56 - 59, The type annotation for detect_vendor_from_class is
incorrect: update its signature from class_name: str to class_name:
Optional[str] and import Optional from typing so the annotation matches the
implementation that handles None (see detect_vendor_from_class and its use of if
not class_name:), ensuring tests that pass None remain valid.
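
A minimal sketch of the suggested annotation change, with a placeholder pattern table standing in for the real detection logic in vendor_detection.py:

```python
from typing import Optional

# Placeholder patterns; the real module's table covers all 13 vendors.
_PATTERNS = {"ChatAnthropic": "anthropic", "ChatOpenAI": "openai"}

def detect_vendor_from_class(class_name: Optional[str]) -> str:
    """Return a canonical provider string; defaults to "langchain"."""
    if not class_name:  # handles both None and "" explicitly
        return "langchain"
    for pattern, vendor in _PATTERNS.items():
        if pattern in class_name:
            return vendor
    return "langchain"
```

With `Optional[str]` in the signature, the `(None, "langchain")` parametrized case type-checks cleanly against the implementation.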
