DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI · 2025
DeepSeek
The DeepSeek-R1 technical report. Demonstrates that pure reinforcement learning (RL) applied directly to a base model, as in DeepSeek-R1-Zero, can induce long chain-of-thought reasoning without supervised reasoning data.