Pull requests: huggingface/trl
feat(async-grpo): add sampling parameter parity
#5418
opened Mar 31, 2026 by
kdubovikov
4 of 8 tasks
fix(async-grpo): honor model init dtype
#5416
opened Mar 31, 2026 by
kdubovikov
3 of 8 tasks
Skip redundant forward pass for on-policy vLLM importance sampling
#5413
opened Mar 31, 2026 by
GJ98
3 of 8 tasks
Add log_multimodal param to GRPOConfig and RLOOConfig to control image logging
#5408
opened Mar 30, 2026 by
apardyl
3 of 8 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407
opened Mar 30, 2026 by
cmpatino
3 of 5 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406
opened Mar 30, 2026 by
BrownianNotion
5 of 8 tasks
Add per-sample tool filtering to GRPOTrainer via tools column
#5398
opened Mar 27, 2026 by
lailanelkoussy
3 tasks done
feat(grpo): add stop_tool_names for immediate agent loop termination
#5390
opened Mar 27, 2026 by
lailanelkoussy
Fix DAPO token-level loss to use prompt-level aggregation
#5381
opened Mar 26, 2026 by
matdou
2 of 5 tasks
[vllm-serve] Add extra_llm_kwargs for passing additional arguments to vllm.LLM()
#5367
opened Mar 25, 2026 by
jonahsamost
1 of 5 tasks
Add chunked LM head for memory-efficient log-prob computation for AsyncGRPOTrainer
#5349
opened Mar 23, 2026 by
AmineDiro
[Test] Fix *test_training_vlm_multi_image* by skipping vision params in assertion
#5341
opened Mar 22, 2026 by
YangKai0616
Fix Liger kernel crash with device_map="auto" on multi-GPU in GRPOTrainer
#5340
opened Mar 22, 2026 by
YangKai0616
Support multimodal tool responses in environment_factory for VLM training
#5323
opened Mar 20, 2026 by
sergiopaniego
5 tasks
(4/5) async grpo break out of generation loop (is_done)
#5321
opened Mar 20, 2026 by
AmineDiro