view article Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day 30 days ago • 48
view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 63
view article Article Building Jobly: Semantic Job Matching with RAG and Vector Embeddings Nov 28, 2025 • 12
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs Paper • 2508.14896 • Published Aug 20, 2025 • 22
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22, 2025 • 63
view article Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face +3 Jul 29, 2025 • 206
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 5 items • Updated Nov 14, 2025 • 162
view article Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11, 2025 • 97