Frontier Signal
FREIA: Unsupervised RL Enhances LLM Reasoning with Adaptive Rewards
FREIA, a new unsupervised reinforcement learning algorithm, improves LLM reasoning by adaptively balancing consensus and exploration, outperforming baselines on mathematical tasks.