Pull requests: huggingface/trl

Pull requests list

feat(async-grpo): add sampling parameter parity
#5418 opened Mar 31, 2026 by kdubovikov
4 of 8 tasks
Delta weight sync using Xet buckets
#5417 opened Mar 31, 2026 by AmineDiro Draft
8 tasks
fix(async-grpo): honor model init dtype
#5416 opened Mar 31, 2026 by kdubovikov
3 of 8 tasks
Remove duplicated prepare_deepspeed
#5414 opened Mar 31, 2026 by albertvillanova
Skip redundant forward pass for on-policy vLLM importance sampling
#5413 opened Mar 31, 2026 by GJ98
3 of 8 tasks
add JEPO trainer
#5411 opened Mar 31, 2026 by zbills
3 of 7 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407 opened Mar 30, 2026 by cmpatino
3 of 5 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406 opened Mar 30, 2026 by BrownianNotion
5 of 8 tasks
Add per-sample tool filtering to GRPOTrainer via tools column
#5398 opened Mar 27, 2026 by lailanelkoussy
3 tasks done
Better test consistency RLOO vs GRPO
#5396 opened Mar 27, 2026 by qgallouedec
Add tool calling support to RLOOTrainer
#5395 opened Mar 27, 2026 by qgallouedec
Remove xfail for ZeRO 2 and 3 + SFT + PEFT test
#5383 opened Mar 27, 2026 by qgallouedec
Fix DAPO token-level loss to use prompt-level aggregation
#5381 opened Mar 26, 2026 by matdou
2 of 5 tasks
Remove truncation_mode from DPO
#5372 opened Mar 25, 2026 by albertvillanova
add more generic device support for CI tests
#5357 opened Mar 24, 2026 by kaixuanliu
Enable Tensor Parallelism in SFT script
#5331 opened Mar 21, 2026 by songhappy
(5/5) async grpo metrics
#5322 opened Mar 20, 2026 by AmineDiro