LongCat-2.0
Meituan's LongCat-2.0 long-context language model for extended document understanding and generation.
Model Description
Model Introduction
We introduce LongCat-2.0, a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token — a substantial step up from previous LongCat models, accompanied by several architectural improvements.
Both the full training run and the large-scale deployment are built entirely on AI ASIC superpods. Pretraining spans millions of accelerator-hours across more than 35 trillion tokens, with no rollbacks or irrecoverable loss spikes — demonstrating that we have the capability to conduct frontier-scale training on alternative hardware platforms.
To strengthen the model on long-horizon tasks, we introduce LongCat Sparse Attention and train LongCat-2.0 on hundreds of billions of tokens of 1M-context data. Together with dedicated post-training, this gives LongCat-2.0 strong performance on coding and agentic tasks.
[!NOTE] 🏋️ Model weights coming soon — stay tuned!
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.