verl

Here are 22 public repositories matching this topic...

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

machine-learning reinforcement-learning tinker distributed-training ml-infrastructure ml-platform agent-framework search-agent llm-training llm-reasoning agentic-workflow swe-agent verl coding-agent

Updated Apr 17, 2026
Python

TsinghuaC3I / MARTI

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

camel llama gemma multi-agent-systems autogen multi-agent-reinforcement-learning large-language-models qwen large-reasoning-models deepseek-r1 verl openrlhf

Updated Apr 14, 2026
Python

NVlabs / GDPO

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

rl reasoning trl llm agentic-ai grpo verl

Updated Feb 17, 2026
Python

thuml / RLVR-World

Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934

text-game video-generation robotic-manipulation video-prediction web-agent real2sim world-model webarena video-gpt grpo verl rlvr reinforcement-learning-with-verifiable-rewards

Updated Oct 28, 2025
Python

GAIR-NLP / OctoThinker

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

rl llama reasoning post-training pre-training llm qwen verl mid-training

Updated Jul 23, 2025
Jupyter Notebook

yjyddq / DARE

Official repository of DARE: Diffusion Large Language Models Alignment and Reinforcement Executor

reinforcement-learning alignment rl verl diffusion-language-models diffusion-large-language-model dllm masked-diffusion-large-language-model block-diffusion-large-language-model dllm-infra dllm-rl-infra dllm-rl

Updated Apr 13, 2026
Python

Trae1ounG / BuPO

[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

rl interpretability llms llm-reasoning verl

Updated Feb 6, 2026
Python

sylvain-wei / 24-Game-Reasoning

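A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 Game" as an example. It uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. Clean, minimal, accessible reproduction of DeepSeek R1-Zero and DeepSeek R1.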

alignment reasoning r1 post-training cot sft o1 24game llm rlhf deepseek r1-zero verl long-cot

Updated Apr 5, 2025
Python

josancamon19 / rl-scaling-laws

RL on the qwen3-base family of models on gsm8k using verl: is there an RL power law on downstream tasks?

rl scaling-laws verl

Updated Oct 19, 2025
Python

bowen-upenn / PersonaMem-v2

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

personalization reinfocement-learning llm long-context personalized-generation llm-memory reinforcement-finetuning grpo verl agentic-memory rlvr

Updated Apr 1, 2026
Python

Graph-Reasoner / Graph-R1

Long CoT RFT and Reinforcement Learning Create Generalization

ai graph rl reasoning reasoning-language-models verl

Updated Aug 29, 2025
Python

zsychina / Curriculum-LLM

Using automated curriculum learning to enhance LLM's RL training process.

reinforcement-learning curriculum-learning llm qwen verl

Updated Mar 25, 2025
Python

omron-sinicx / dgpo

[ACL 2026 main] DGPO: Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities

reinforcement-learning pixi rag llm agentic-rag verl acl2026

Updated Apr 12, 2026
Python

Magnicord / llm-env-templates

A list of uv environment templates for LLM development.

python environment deep-learning conda pytorch venv uv llm flash-attn verl openrlhf

Updated Sep 19, 2025

RunRiotComeOn / AVR

AVR: Learning Adaptive Reasoning Paths for Efficient Visual Reasoning

reinforcement-learning vlm llama-factory reasoning-language-models verl overthinking

Updated Apr 12, 2026
Python

rabiloo / llm-finetuning

Sample for Fine-Tuning LLMs & VLMs

transformers perf moe lora fine-tuning large-language-models llm rlhf qlora qwen llama-factory llama3 grpo verl

Updated Apr 3, 2025
Python

hung20gg / forecast-agent-training-scripts

Training scripts using VeRL for multi-turn GRPO with MCP tool calling

mcp multi-turn verl

Updated Mar 22, 2026
Python

cognichip / Noisy-RL

RLVεR: Reinforcement Learning with Verifiable Noisy Rewards

rl llm grpo verl

Updated Jan 9, 2026
Python

KDEGroup / SWE-AGILE

[ACL 2026 Findings] SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context

agent acl llm-training llm-reasoning swe-agent verl coding-agent

Updated Apr 15, 2026
Python

awinml / verl-turing-support

Fork of VeRL to support Turing Family of GPUs

turing verl

Updated Nov 2, 2025
Python
