Pull requests: THUDM/slime

Pull requests list

pass critic role through to create RayTrainGroup
#1797 opened Apr 2, 2026 by znculee
Add rollout sampling-mask support
#1795 opened Apr 2, 2026 by yitianlian
[WIP] fix loss oom (label: run-ci-megatron)
#1788 opened Mar 31, 2026 by lilei199908
Hook proposal
#1774 opened Mar 27, 2026 by andrija-s Draft
Add host memory metrics to available_memory function
#1764 opened Mar 25, 2026 by peterjc123
[Fix] Initialize grad_norm before found_inf skip path
#1762 opened Mar 24, 2026 by kaysonyu
feat: add npu patch for qwen3-vl-8b grpo & ppo
#1750 opened Mar 23, 2026 by cjy0x
[docker] fix qwen3_vl visual module loading
#1727 opened Mar 15, 2026 by ZHZisZZ
Add Mooncake Backend for Rollout Data Transfer (label: run-ci-megatron)
#1709 opened Mar 11, 2026 by zxpdemonio
6 tasks done
fix: make ray actor gpu fractions configurable
#1699 opened Mar 10, 2026 by ailuntz
fix: accept unboxed math answers
#1698 opened Mar 10, 2026 by ailuntz
fix: default reward for aborted samples
#1697 opened Mar 10, 2026 by ailuntz
fix: handle missing sglang cuda-graph constant
#1696 opened Mar 10, 2026 by ailuntz
PipelineRL -- keep cache on weight update
#1694 opened Mar 9, 2026 by hari-hm
fix: normalize rewards per-group when sample counts are unequal
#1655 opened Mar 2, 2026 by dubin555
2 of 3 tasks
feat: Add knowledge distillation example with offline support
#1654 opened Mar 2, 2026 by tourzhao
3 tasks
Refactor code safety checks by removing patterns
#1643 opened Feb 28, 2026 by Rohan5commit