Thrilled to unveil DS-MoE: a dense-training, sparse-inference scheme for better computational and memory efficiency in your MoE models!
Discover more in our blog: https://huggingface.co/blog/bpan/ds-moe and dive into the details with our paper: Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models (2404.05567)
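A minimal sketch of the core idea, dense routing during training and top-k sparse routing at inference, using a toy NumPy MoE layer (a hypothetical illustration, not the paper's implementation; names like `MoELayer` and `top_k` are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Toy mixture-of-experts layer where each expert is a linear map."""

    def __init__(self, d_model, n_experts):
        self.router = rng.standard_normal((d_model, n_experts))
        self.experts = rng.standard_normal((n_experts, d_model, d_model))

    def forward(self, x, top_k=None):
        # Router produces a gate weight per expert for each token.
        gates = softmax(x @ self.router)  # (batch, n_experts)
        if top_k is not None:
            # Sparse inference: zero out all but the top-k gates per token,
            # then renormalize, so only k experts run per token.
            drop = np.argsort(gates, axis=-1)[:, :-top_k]
            np.put_along_axis(gates, drop, 0.0, axis=-1)
            gates /= gates.sum(axis=-1, keepdims=True)
        # Weighted sum of expert outputs (dense path uses every expert).
        return np.einsum("be,edf,bd->bf", gates, self.experts, x)

layer = MoELayer(d_model=8, n_experts=4)
x = rng.standard_normal((2, 8))
dense = layer.forward(x)            # training-style: all experts contribute
sparse = layer.forward(x, top_k=1)  # inference-style: one expert per token
```

With `top_k` equal to the number of experts, the sparse path reduces to the dense one, which is a handy sanity check for this kind of gating code.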