
We introduce MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. MiniMax-M1 is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism. It builds on our previous MiniMax-Text-01 model, which has 456 billion total parameters, of which 45.9 billion are activated per token.
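To put the parameter counts quoted above in perspective, the short sketch below computes the share of parameters that an MoE model of this size activates per token. The function name `active_fraction` and the constants are illustrative and not part of any MiniMax or Fireworks API; the only inputs are the two figures stated in the text.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters used per token in an MoE model."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


# Figures taken from the description above (hypothetical constant names).
TOTAL_PARAMS = 456e9    # 456 billion total parameters
ACTIVE_PARAMS = 45.9e9  # 45.9 billion activated per token

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"{fraction:.1%} of parameters are active per token")
# prints "10.1% of parameters are active per token"
```

Roughly one tenth of the weights take part in each forward pass, which is why per-token compute is much lower than the total parameter count alone would suggest.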
On-demand Deployment | On-demand deployments let you run MiniMax-M1-80k on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits. |