# Chatterbox Multilingual TTS (8-bit Quantized)
8-bit quantized port of ResembleAI's Chatterbox Multilingual TTS, reducing memory footprint while preserving the original 23-language coverage.
This repository contains only the quantized weights plus auxiliary text-processing files. Model weights, architecture, and training are entirely ResembleAI's work: all credit for the underlying model goes to the Chatterbox team. This port adds only 8-bit quantization and bundles per-language text-processing helpers.
## What's included
| File / Directory | Role |
|---|---|
| `model.safetensors` | 8-bit quantized model weights (1.33 GB) |
| `config.json` | Model config |
| `tokenizer.json` / `tokenizer_config.json` / `vocab.txt` | Tokenizer |
| `Cangjie5_TC.json` | Traditional Chinese Cangjie input-method dictionary (Chinese text preprocessing) |
| `russian_stress_dict.json.gz` | Russian word-stress dictionary (stress-mark insertion for better pronunciation) |
| `HebrewDiacritization.mlmodelc` / `.mlpackage` | Core ML model that adds nikud (Hebrew vowel marks) so Hebrew text can be pronounced correctly |
The auxiliary files cover languages where the written script doesn't fully specify pronunciation. Load them alongside the main model to get quality comparable to the float-precision original for Chinese, Russian, and Hebrew.
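As an illustrative sketch of how one of these auxiliaries might be consumed: the Russian stress dictionary is a gzipped JSON file, and a preprocessing step can use it to insert combining stress marks before text reaches the model. The schema (word → index of the stressed vowel) and function names below are assumptions for illustration, not the documented pipeline:

```python
import gzip
import json
import os
import tempfile

# Hypothetical schema: word -> index of the stressed vowel.
# The real russian_stress_dict.json.gz schema is not documented here.
sample_dict = {"молоко": 5, "москва": 5}

# Write a small gzipped JSON file to stand in for russian_stress_dict.json.gz.
path = os.path.join(tempfile.mkdtemp(), "stress_dict.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump(sample_dict, f)

def load_stress_dict(p):
    """Load a gzipped JSON stress dictionary (same file pattern as the bundled one)."""
    with gzip.open(p, "rt", encoding="utf-8") as f:
        return json.load(f)

def mark_stress(word, stress):
    """Insert a combining acute accent (U+0301) after the stressed vowel."""
    i = stress.get(word.lower())
    if i is None:
        return word  # unknown word: leave unmarked
    return word[: i + 1] + "\u0301" + word[i + 1 :]

stress = load_stress_dict(path)
print(mark_stress("молоко", stress))  # молоко́
```

Words missing from the dictionary simply pass through unmarked, which matches how a best-effort pronunciation aid would behave.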
## Languages
23 languages, matching the ResembleAI original: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish.
## Model details
- Base parameter count: ~0.3B (matches ResembleAI original)
- Quantization: 8-bit weights
- Format: Safetensors (tensor dtypes: F32 for scales/biases, U32 for packed int8 weights)
- Features preserved from base: zero-shot voice cloning, emotion exaggeration, alignment-informed inference
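The F32/U32 dtype split can be checked directly from the safetensors header, which is simply an 8-byte little-endian length followed by a JSON block. A stdlib-only sketch (the tensor names and shapes here are invented for illustration; a real checkpoint has many entries):

```python
import json
import struct

def read_safetensors_header(raw: bytes) -> dict:
    """Parse a safetensors header: 8-byte little-endian u64 length, then JSON."""
    (n,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8 : 8 + n])

# Build a tiny in-memory file with the same layout as model.safetensors.
header = {
    "layer.weight.packed": {"dtype": "U32", "shape": [64, 16], "data_offsets": [0, 4096]},
    "layer.weight.scales": {"dtype": "F32", "shape": [64], "data_offsets": [4096, 4352]},
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + b"\x00" * 4352

meta = read_safetensors_header(blob)
dtypes = sorted({info["dtype"] for info in meta.values()})
print(dtypes)  # ['F32', 'U32'] — scales stay float, packed weights are U32
```

Running the same parse against the downloaded `model.safetensors` (read its first bytes instead of `blob`) shows which tensors carry the float scales and which the packed integer codes.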
## Targets

Mixed: the main model and tokenizer artifacts are framework-agnostic Safetensors, usable anywhere Chatterbox runs. The bundled `HebrewDiacritization.mlmodelc` / `.mlpackage` is a Core ML model intended for on-device Apple platforms (iOS 17+, macOS 14+). If you're running on another platform, swap in your preferred Hebrew nikud source.
## Quantization

Quantized from the original float checkpoint. Accuracy vs. the float baseline depends on your workload; run audio-quality A/B checks against the ResembleAI original if exact parity matters for your use case.
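For intuition about what 8-bit weight quantization costs, here is a generic symmetric per-tensor sketch (one float scale plus int8 codes). The actual scheme used in this port — per-channel vs. per-tensor, symmetric vs. affine — is not documented, so treat this purely as an illustration of the round-trip error you would be A/B-testing:

```python
def quantize_8bit(values):
    """Symmetric per-tensor 8-bit quantization: one float scale + int8 codes.
    Generic illustration only; the port's actual scheme is not specified."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid scale=0 on all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return scale, q

def dequantize_8bit(scale, q):
    """Recover approximate float weights from the codes."""
    return [scale * c for c in q]

weights = [0.5, -1.27, 0.003, 0.9]
scale, q = quantize_8bit(weights)
restored = dequantize_8bit(scale, q)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max round-trip error: {max_err:.5f}")  # bounded by scale / 2
```

The per-element error is at most half a quantization step (`scale / 2`), which is why the scale tensors must stay in F32: a coarse scale would inflate every weight's error.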
## License & Attribution
This port inherits the MIT license from ResembleAI. See the original Chatterbox model card for terms.
The model weights, architecture, and training are ResembleAI's work. This repository provides only 8-bit quantization and bundled text-processing auxiliaries. Please cite and credit ResembleAI for any use of the underlying model.
## Links
- Original model: https://huggingface.co/ResembleAI/chatterbox
- License: MIT