How to use moondream/md3p-int4 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir md3p-int4 moondream/md3p-int4
MD3 Preview - Int4 Quantized (MLX)
Pre-quantized version of Moondream 3 Preview for MLX inference.
Quantization Details
- MoE Experts: int4 affine quantization (bits=4, group_size=64)
- Other weights: bf16 (unchanged)
- Memory savings: ~60% reduction in MoE weight memory
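To make the quantization settings above concrete, here is a minimal pure-Python sketch of group-wise affine quantization with bits=4 and group_size=64. It is an illustration only, not the code used to produce this checkpoint: the function names (quantize_group, dequantize_group, quantize) are hypothetical, and MLX's own kernels pack the codes into integers and may differ in rounding and storage details.

```python
from typing import List, Tuple

Group = Tuple[List[int], float, float]  # (codes, scale, bias) for one group


def quantize_group(values: List[float], bits: int = 4) -> Group:
    """Affine-quantize one group of weights to unsigned `bits`-bit codes."""
    levels = (1 << bits) - 1              # 15 representable steps for int4
    lo, hi = min(values), max(values)
    # Guard against a constant group, where hi == lo would give scale 0.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo               # bias is the group minimum


def dequantize_group(codes: List[int], scale: float, bias: float) -> List[float]:
    """Reconstruct approximate weights: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize(weights: List[float], bits: int = 4, group_size: int = 64) -> List[Group]:
    """Split a flat weight list into groups and quantize each independently."""
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


if __name__ == "__main__":
    # Hypothetical toy weights: 128 evenly spaced values, i.e. two groups of 64.
    toy = [i / 127 - 0.5 for i in range(128)]
    groups = quantize(toy)
    restored = [w for g in groups for w in dequantize_group(*g)]
    worst = max(abs(a - b) for a, b in zip(toy, restored))
    print(f"groups={len(groups)} worst_abs_error={worst:.6f}")
```

Each group of 64 weights gets its own scale and bias, so the rounding error of any weight is bounded by half of that group's scale. This is why a smaller group_size generally trades extra metadata for better accuracy.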
Source
Quantized from moondream/moondream3-preview