The Smol Training Playbook 📚 • The secrets to building world-class LLMs • Running on CPU Upgrade • Featured • 2.97k
Qwen/Qwen3-Next-80B-A3B-Thinking • Text Generation • 81B • Updated Sep 15, 2025 • 75.6k • 477
CohereLabs/command-a-reasoning-08-2025 • Text Generation • 111B • Updated 27 days ago • 392 • 128
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLM on large GPU Clusters • Running • 3.67k
Deepseek Papers Collection • 29 items • Updated about 20 hours ago • 320