Skip to content

fix(tracing): migrate RemoteRolloutProcessor to /traces endpoint#444

Closed
xzrderek wants to merge 0 commit intomainfrom
fix/traces-endpoint-migration
Closed

fix(tracing): migrate RemoteRolloutProcessor to /traces endpoint#444
xzrderek wants to merge 0 commit intomainfrom
fix/traces-endpoint-migration

Conversation

@xzrderek
Copy link
Copy Markdown
Contributor

@xzrderek xzrderek commented Apr 21, 2026

Summary

The /v1/traces/pointwise endpoint on the Fireworks tracing proxy was removed in favor of a unified /v1/traces endpoint. This adapts FireworksTracingAdapter (and its default data loader) to the new contract so RemoteRolloutProcessor keeps working.

What changed

Wire changes on the adapter (eval_protocol/adapters/fireworks_tracing.py)

  • URL: /v1/traces/pointwise/v1/traces (both the default path and the project_id-scoped variant).
  • Query params: the old endpoint took tags=rollout_id:<id> as the scoping mechanism; the new one expects rollout_id as a top-level query parameter. get_evaluation_rows still accepts the existing tags=[...] kwarg so callers don't have to change — it just pulls the rollout_id:<id> entry out and promotes it to a dedicated param, and raises ValueError if none is supplied.
  • Response shape: the new endpoint returns flat row dicts with PascalCase keys (Input, Output, Tags, InsertionId) instead of the old nested snake-case shape that had an observations[] array. The converter now reads the new keys directly and drops the "fall back to last GENERATION observation" branch, which has no equivalent on the new endpoint.
  • session_data["langfuse_trace_id"] is now sourced from InsertionId (the per-LLM-call id on the new row model). The key name is kept so downstream consumers that read it keep working.

Default data loader (eval_protocol/pytest/tracing_utils.py)

  • Passes limit=1 instead of max_retries=5. update_row_with_remote_trace only consumes one row and raises if multiple come back, so fetching more is just wasted bytes. max_retries=5 was a no-op retry knob specific to the old polling path, which the new endpoint doesn't expose.

Change Overview

flowchart LR
    caller[RemoteRolloutProcessor] --> loader[default_fireworks_output_data_loader]
    loader --> adapter[FireworksTracingAdapter.get_evaluation_rows]
    adapter -- old: tags=rollout_id:X --> old[[/v1/traces/pointwise<br/>snake_case + observations[]]]
    adapter -- new: rollout_id=X, limit=1 --> new[[/v1/traces<br/>PascalCase flat rows]]
    new --> conv[convert_trace_dict_to_evaluation_row<br/>reads Input/Output/Tags/InsertionId]
    conv --> row[EvaluationRow]

    style old stroke-dasharray: 4 2,opacity:0.4
Loading

Testing

  • Existing unit tests that exercise the adapter keep passing (no changes to the callable signatures).
  • End-to-end validation was done against a local gateway pointed at the new endpoint: fire a /chat/completions with a fresh rollout id, then call adapter.get_evaluation_rows(tags=[\"rollout_id:<id>\"], limit=1) and confirm the returned EvaluationRow has the expected messages + execution_metadata.rollout_id.
  • Pre-commit hooks (ruff format, ruff lint, basedpyright) all pass locally.

Risk / Compatibility

  • Caller API unchanged. Anything that was passing tags=[\"rollout_id:<id>\"] into get_evaluation_rows keeps working; the adapter internally rewrites to the new query param.
  • Hard failure for callers with no rollout id tag. Previously you could pass arbitrary tag filters ([\"production\"], etc.) and the server would honor them. The new endpoint is rollout-scoped, so the adapter now raises ValueError if tags doesn't include a rollout_id:<id> entry. I don't think there are any callers relying on the generic-tag path — update_row_with_remote_trace is the only downstream consumer in this SDK and it always scopes by rollout — but flagging it here so it's explicit.
  • Drops the GENERATION-observation fallback, which matters only if you were calling get_evaluation_rows against a legacy tracing backend that still returned that shape. All supported tracing.fireworks.ai deployments have moved to the new endpoint.

Made with Cursor


Note

Low Risk
Low risk because the diff as provided is empty and does not modify runtime behavior. If this is unintended, the main risk is missing the expected tracing endpoint migration.

Overview
No changes are present in the supplied diff (+++ /dev/null only), so this PR appears to be a no-op.

If an endpoint migration was intended, reviewers should re-check that the actual diff includes the expected updates (e.g., FireworksTracingAdapter.get_evaluation_rows and the default tracing data loader).

Reviewed by Cursor Bugbot for commit 0655f89. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit aa476f0. Configure here.

Comment thread eval_protocol/adapters/fireworks_tracing.py
@xzrderek xzrderek closed this Apr 21, 2026
@xzrderek xzrderek force-pushed the fix/traces-endpoint-migration branch from 0c2f7db to 0655f89 Compare April 21, 2026 04:53
@xzrderek
Copy link
Copy Markdown
Contributor Author

Closing in favor of a server-side compatibility shim. Plan: update the tracing gateway so both /traces and /traces/pointwise accept the legacy request shape (tags list + passthrough params) and emit snake_case responses; same for /logs. That way pinned customer SDK versions (figma/cosine/macroscope) keep working without an SDK update. Server-side change to follow in a fireworks PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant