fix(geo3k-vlm-sft): remove --apply-chat-template from SFT launch script#1791

Merged
zhuzilin merged 1 commit into THUDM:main from DongzhuoranZhou:feat/search-r1-geo3k-vlm-sft on Apr 3, 2026

Conversation

Contributor

@DongzhuoranZhou commented Apr 1, 2026

Problem

--apply-chat-template in run_geo3k_vlm_sft.sh converts the messages list into a plain string before it reaches the SFT rollout. sft_rollout.generate_rollout() expects sample.prompt to be a list of message dicts, so passing a string causes a TypeError at runtime — training fails immediately.

Fix

Remove the flag. The messages column produced by prepare_sft_data.py is already a structured list; the SFT rollout consumes it directly to build the per-token loss mask.
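To make the failure mode concrete, here is a minimal sketch of why a flattened prompt breaks the rollout. The `sample` dict shape and the stub rollout are illustrative only, based on the PR description; the actual slime data structures may differ.

```python
# Illustrative only: sample shapes follow the PR description, not the repo's API.
sample_ok = {"prompt": [{"role": "user", "content": "Find the angle ABC."}]}
# With --apply-chat-template, the messages list is rendered to one string:
sample_bad = {"prompt": "<|user|>Find the angle ABC.<|assistant|>"}

def rollout_stub(sample):
    """Stub of a rollout that expects sample['prompt'] to be message dicts."""
    assistant_turns = []
    for msg in sample["prompt"]:
        # Iterating a plain string yields characters, and indexing a
        # character with msg["role"] raises TypeError at runtime.
        if msg["role"] == "assistant":
            assistant_turns.append(msg["content"])
    return assistant_turns
```

Calling `rollout_stub(sample_bad)` raises `TypeError` on the first iteration, which matches the immediate training failure described above.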

--apply-chat-template flattens the messages list into a plain string,
stripping the role boundaries that the SFT rollout needs to build the
per-token loss mask. With the flag present, user/system tokens are
included in the loss — the model trains on the wrong tokens.

sft_rollout.generate_rollout() expects messages as a structured
list so it can identify assistant turns and mask everything else out.
Removing the flag restores correct SFT loss masking behaviour.
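The masking logic described above can be sketched as follows. The function name, the toy whitespace tokenizer, and the mask convention (1 = contributes to the loss) are assumptions for illustration, not the repository's implementation.

```python
# Hedged sketch of per-token loss masking from structured messages.
# build_loss_mask and the toy tokenizer are hypothetical, not the repo's API.
def build_loss_mask(messages, tokenize):
    """Tokenize each turn; mask is 1 only on assistant tokens.

    Role boundaries come from the structured list. If the list were
    flattened to one string first, this per-turn attribution would be
    impossible and user/system tokens would leak into the loss.
    """
    token_ids, loss_mask = [], []
    for msg in messages:
        toks = tokenize(msg["content"])
        token_ids.extend(toks)
        flag = 1 if msg["role"] == "assistant" else 0
        loss_mask.extend([flag] * len(toks))
    return token_ids, loss_mask

# Toy whitespace "tokenizer" standing in for a real one.
tokenize = lambda s: s.split()
msgs = [
    {"role": "user", "content": "solve for x"},
    {"role": "assistant", "content": "x = 3"},
]
ids, mask = build_loss_mask(msgs, tokenize)
# mask is 1 exactly on the assistant turn's tokens
```

The key design point: the mask is computed per message, so it survives only as long as the role structure does.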

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zhuzilin zhuzilin merged commit 887fba1 into THUDM:main Apr 3, 2026
