Reinforcement Learning in Simulink Example

23h

The Reinforcement Gap — or why some AI skills improve faster than others

AI tasks that work well with reinforcement learning are getting better fast — and threatening to leave the rest of the ...

GitHub

Train multi-step agents for real-world tasks using GRPO.

RULER (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the ...

IEEE

Observer-Based Multi-Agent Reinforcement Learning for Pursuit-Evasion Game With Multiple Unknown Uncertainties

Abstract: This paper aims to investigate the challenging problem of a multi-agent game with multiple pursuers and a single evader in an environment with multiple unknown uncertainties. A coupled ...

IEEE

Implementation of Deep Reinforcement Learning for Model-free Switching And Control of a 23-level Single DC Source Hybrid Packed U-Cell (HPUC)

Abstract: This paper proposes a novel Deep Reinforcement Learning (DRL) method for controlling a 23-level Single DC Source Hybrid Packed U-Cell (HPUC) converter. The HPUC topology generates a high ...

GitHub

Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models

We propose TraceRL, a trajectory-aware reinforcement learning method for diffusion language models, which demonstrates the best performance among RL approaches for DLMs. We also introduce a ...
