feat: preserve token IDs on messages #448

Open
Hecate0821 wants to merge 1 commit into main from chengxi/message-token-ids

Hecate0821 commented Apr 28, 2026

Summary

Adds token-native trace support to EP messages without turning EP into an RL framework.

Current Design

eval_protocol.models.Message now has an optional field, token_ids: list[int] | None. This lets rollout processors preserve the inference engine's token IDs alongside message content and provider logprob metadata.

The schema remains backward compatible:

  • Provider-specific logprobs dictionaries are still accepted unchanged.
  • Text-only messages still work.
  • A strict alignment check only applies to the clean token-native shape where both token_ids and flat list[float] logprobs are set. In that case, lengths must match.
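The alignment rule above can be sketched with a plain dataclass. This is only an illustration under stated assumptions: MessageSketch and _is_flat_float_list are hypothetical names, and the real eval_protocol.models.Message may implement the check differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


def _is_flat_float_list(value: Any) -> bool:
    # Hypothetical helper: True only for the "clean" shape, a flat list of floats.
    return isinstance(value, list) and all(isinstance(x, float) for x in value)


@dataclass
class MessageSketch:
    """Hypothetical stand-in, not the real eval_protocol.models.Message."""

    role: str
    content: str = ""
    token_ids: Optional[list[int]] = None
    logprobs: Any = None  # provider dict, flat list of floats, or None

    def __post_init__(self) -> None:
        # Text-only messages and provider-specific dicts pass through unchecked.
        if self.token_ids is None or not _is_flat_float_list(self.logprobs):
            return
        # Token-native shape: one logprob per token ID.
        if len(self.token_ids) != len(self.logprobs):
            raise ValueError(
                f"token_ids has {len(self.token_ids)} entries but "
                f"logprobs has {len(self.logprobs)}"
            )


# Aligned token-native message: accepted.
ok = MessageSketch(role="assistant", content="hi", token_ids=[11, 42], logprobs=[-0.1, -0.3])

# Provider-specific dict: accepted unchanged, no length check.
legacy = MessageSketch(role="assistant", content="hi", token_ids=[11], logprobs={"content": []})
```

Constructing MessageSketch with token_ids=[11, 42] and logprobs=[-0.1] raises ValueError in this sketch, which mirrors the "lengths must match" rule.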

SingleTurnRolloutProcessor now extracts token IDs from serialized provider logprobs when they are available:

  • logprobs.content[].token_id
  • logprobs.token_ids[]

Those IDs are stored in Message.token_ids on the assistant message. If the provider does not expose token IDs, the field remains unset.
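A minimal sketch of that extraction, assuming the serialized logprobs arrive as a plain dict. The function name extract_token_ids is hypothetical, and trying content[].token_id before token_ids[] is an assumption; the processor's actual lookup order is not stated here.

```python
from typing import Any, Optional


def extract_token_ids(logprobs: Any) -> Optional[list[int]]:
    """Hypothetical helper: pull token IDs out of serialized provider logprobs."""
    if not isinstance(logprobs, dict):
        return None

    # Shape 1: logprobs.content[].token_id (one entry per generated token).
    content = logprobs.get("content")
    if isinstance(content, list) and content:
        ids = [e.get("token_id") for e in content if isinstance(e, dict)]
        # Accept only if every entry carried an integer token_id.
        if len(ids) == len(content) and all(isinstance(i, int) for i in ids):
            return ids

    # Shape 2: logprobs.token_ids[] (a flat list of IDs).
    flat = logprobs.get("token_ids")
    if isinstance(flat, list) and flat and all(isinstance(i, int) for i in flat):
        return list(flat)

    # Provider did not expose token IDs: leave the field unset.
    return None


per_entry = extract_token_ids(
    {"content": [{"token": "Hel", "token_id": 5}, {"token": "lo", "token_id": 7}]}
)
flat_ids = extract_token_ids({"token_ids": [1, 2, 3]})
missing = extract_token_ids({"content": [{"token": "Hel", "logprob": -0.2}]})
```

Returning None rather than an empty list keeps "no token IDs available" distinct from "zero tokens generated".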

Why

The cookbook async RL path is token-native only. Multi-turn RL cannot safely re-tokenize assistant text after rollout because BPE merges can cross turn boundaries and inference logprobs can become misaligned. EP needs a simple message-level carrier for token IDs so downstream training can consume traces without reconstructing tokens from text.
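The re-tokenization hazard can be shown with a toy example. The tokenizer below is a greedy longest-match over a three-entry vocabulary, not a real BPE implementation, and VOCAB and toy_tokenize are made up for illustration. It only demonstrates that tokenizing joined text can differ from joining per-turn tokens.

```python
# Toy vocabulary in which "ab" is a single merged token.
VOCAB = {"a": 0, "b": 1, "ab": 2}


def toy_tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenizer over VOCAB (illustrative only)."""
    ids: list[int] = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink.
        for length in range(len(text) - i, 0, -1):
            piece = text[i : i + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i += length
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return ids


# Tokens as sampled: one turn ended with "a", the next started with "b".
sampled = toy_tokenize("a") + toy_tokenize("b")

# Tokens after re-tokenizing the concatenated text.
retokenized = toy_tokenize("a" + "b")

print(sampled)      # [0, 1]
print(retokenized)  # [2]
```

Here the sampled trace has two tokens and two logprobs, while the re-tokenized text has one token, so the logprobs no longer line up. Carrying token_ids on the message avoids the re-tokenization step entirely.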

Tests / Checks

  • python3.11 -m pytest tests/test_rollout_logprobs.py tests/test_eval_protocol_import.py::TestRewardProtocolFunctionality::test_message_preserves_token_ids tests/test_eval_protocol_import.py::TestRewardProtocolFunctionality::test_message_rejects_misaligned_float_logprobs
  • python3.11 -m ruff check eval_protocol/models.py eval_protocol/pytest/default_single_turn_rollout_process.py tests/test_rollout_logprobs.py tests/test_eval_protocol_import.py
  • git diff --check

Note: running the full tests/test_eval_protocol_import.py currently hits an unrelated failure in TestRewardProtocolImports.test_star_import_works because eval_protocol.adapters does not expose LangfuseAdapter.

Related
