MiniMax-M1
Scaling Reasoning with Hybrid Attention
The world's first open-weight large-scale hybrid-attention reasoning model, pairing a hybrid MoE architecture with lightning attention and supporting a 1-million-token context for long-form tasks.
A New Architecture for a New Era of AI
MiniMax-M1's innovations deliver state-of-the-art performance with breakthrough efficiency.
Hybrid MoE Architecture
A Mixture-of-Experts (MoE) design activates 45.9B of its 456B total parameters per token, spending compute only where each token needs it.
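To make the active-versus-total parameter distinction concrete, here is a minimal sketch of top-k expert routing in PyTorch. The expert count, top-k value, and layer sizes are illustrative placeholders, not M1's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k Mixture-of-Experts layer (placeholder sizes)."""
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = torch.topk(gate, self.top_k)  # route each token
        out = torch.zeros_like(x)
        # Only top_k of n_experts run per token, so the parameters actually
        # exercised per forward pass ("active") are a small slice of the
        # layer's total parameter count.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(4, 64)
print(MoELayer()(x).shape)  # torch.Size([4, 64])
```

Because only top_k experts execute per token, the compute touched per forward pass is a small fraction of the layer's total parameters; at M1's scale, that is the 45.9B-of-456B effect.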
1M Token Context
Natively supports an unprecedented 1 million token context window, 8 times the 128K of DeepSeek R1, for deep document analysis.
Lightning Attention
A novel attention mechanism that dramatically reduces computational cost during inference, making long-context tasks feasible and fast.
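The idea underlying lightning attention is linear attention: a running key-value summary replaces the quadratic attention matrix, so per-token decode cost stays constant as the context grows. The sketch below shows that recurrence in its simplest form; M1's actual kernel is block-wise and interleaved with periodic softmax-attention layers, and the ReLU feature map here is only an illustrative stand-in.

```python
import torch

def linear_attention_decode(q, k, v):
    """Causal linear attention via a running summary S = sum_t k_t v_t^T.

    Each decode step costs O(d^2) regardless of how many tokens precede
    it, whereas softmax attention's step-t cost grows with t.
    """
    seq_len, d = q.shape
    S = torch.zeros(d, d)   # running key-value outer-product state
    z = torch.zeros(d)      # running key normalizer
    outs = []
    for t in range(seq_len):
        phi_q, phi_k = torch.relu(q[t]), torch.relu(k[t])  # stand-in feature map
        S = S + torch.outer(phi_k, v[t])
        z = z + phi_k
        outs.append((phi_q @ S) / (phi_q @ z + 1e-6))
    return torch.stack(outs)

q, k, v = (torch.randn(10, 16) for _ in range(3))
print(linear_attention_decode(q, k, v).shape)  # torch.Size([10, 16])
```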
Hyper-Efficient RL Training
Powered by the CISPO algorithm, M1's RL training finished in just 3 weeks, showcasing a new paradigm in training efficiency.
Leading Performance on Key Benchmarks
MiniMax-M1 demonstrates exceptional capabilities, particularly in complex, real-world tasks that demand deep reasoning and long-context understanding.
56.0%
SWE-bench Verified
Outperforms leading open-weight models in complex, real-world software engineering tasks, showcasing superior problem-solving abilities.
73.4%
OpenAI-MRCR (128k)
Achieves top-tier results on OpenAI's multi-round co-reference resolution benchmark, demonstrating its mastery of long-context information processing.
67.8%
TAU-bench (Retail)
Exhibits advanced agentic tool use, navigating complex multi-turn retail scenarios to complete tasks successfully.
A Revolution in AI Economics & Efficiency
M1 was built not just for performance, but for a new paradigm of cost-effective, scalable AI development, directly challenging industry standards.

Breakthrough Cost-Performance
The entire RL training for M1 was completed in just 3 weeks on 512 H800 GPUs, at a rental cost of only $534,700, an order of magnitude less than originally anticipated. This demonstrates a new level of efficiency in large-model training.
Unmatched Inference Efficiency
M1 is drastically more efficient than its peers: when generating 100K tokens, it consumes roughly 25% of the FLOPs required by DeepSeek-R1, setting a new industry benchmark for large-scale inference.
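For intuition on where such savings come from, the toy calculation below compares per-head attention FLOPs for softmax versus linear attention over a 100K-token generation. The numbers are illustrative only and are not the methodology behind the 25% figure.

```python
d = 128        # per-head dimension (toy value)
n = 100_000    # tokens generated

# Softmax attention: decode step t attends over t cached tokens,
# roughly 2*t*d multiply-adds per head.
softmax_flops = sum(2 * t * d for t in range(1, n + 1))

# Linear attention: a constant ~2*d*d state update per head per step.
linear_flops = n * 2 * d * d

print(f"linear / softmax FLOPs ratio: {linear_flops / softmax_flops:.4f}")
# ~0.0026 for pure linear attention; M1's measured 25% is higher because
# its hybrid stack retains periodic softmax-attention layers.
```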
Innovative CISPO Algorithm
Our novel CISPO reinforcement learning algorithm delivers a 2x speedup over contemporary methods such as DAPO, reaching matched performance in half the training steps and further cementing our lead in efficiency.
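As described in the technical report, CISPO clips the importance-sampling weight itself rather than the token update, so every token keeps a policy gradient. The sketch below captures that structure; the clipping thresholds and tensor shapes are placeholders, not the paper's exact settings.

```python
import torch

def cispo_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """CISPO-style policy loss (sketch; epsilons are placeholders).

    The importance-sampling ratio is clipped and detached, so it acts as
    a fixed per-token weight; the gradient flows through every token's
    log-probability instead of being zeroed out by PPO-style ratio
    clipping.
    """
    ratio = torch.exp(logp_new - logp_old)
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
    return -(clipped * advantages * logp_new).mean()

logp_new = torch.randn(8, requires_grad=True)          # current-policy log-probs
logp_old = (logp_new + 0.1 * torch.randn(8)).detach()  # behavior-policy log-probs
advantages = torch.randn(8)                            # e.g. group-normalized rewards
cispo_loss(logp_new, logp_old, advantages).backward()
```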
Two Models, Tuned for Your Needs
Whether you need balanced performance or maximum reasoning power, there's an M1 for you.
MiniMax-M1 40K
The 40K thinking-budget model (a 40K-token maximum generation length) represents an intermediate stage of M1's training and offers a powerful, efficient baseline. It excels at a wide range of tasks with remarkable speed.
Download Model
MiniMax-M1 80K
The fully-trained 80K thinking budget model provides maximum reasoning depth. It's built for tackling the most complex challenges in software engineering, tool use, and long-context understanding.
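For readers who want to try a checkpoint, the sketch below shows one plausible load path with Hugging Face transformers. The repo id MiniMaxAI/MiniMax-M1-80k is assumed from MiniMax's Hugging Face organization, so check the model card for exact identifiers, hardware requirements, and the recommended serving stack; the full model is far too large for a single consumer GPU.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed; confirm on the model card. The custom hybrid-attention
# stack needs trust_remote_code, and the 456B-parameter model requires a
# multi-GPU serving setup.
model_id = "MiniMaxAI/MiniMax-M1-80k"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Plan a fix for a failing unit test."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```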
Download Model
Transparent, Open-Source Science
Our breakthroughs are detailed in our technical report. We believe in advancing the field through open collaboration and rigorous research.
Read the Paper on arXiv