Stable Diffusion 2.0 Inpainting - CoreML
Warning: This model has known quality issues. For production use, we recommend Realistic Vision Inpainting instead.
CoreML conversion of Stable Diffusion 2.0 Inpainting for Apple Silicon devices (iPhone, iPad, Mac).
Model Details
| Property | Value |
|---|---|
| Base Model | Stable Diffusion 2.0 Inpainting |
| Resolution | 512x512 |
| UNet Channels | 9 (4 latent + 1 mask + 4 masked-image latent) |
| Prediction Type | V-Prediction |
| Attention | SPLIT_EINSUM (optimized for the Apple Neural Engine, ANE) |
| Safety Checker | Not included |
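The 9-channel UNet input listed above can be illustrated with a short NumPy sketch. It assumes the usual Stable Diffusion inpainting layout (noisy latents, then mask, then masked-image latents, concatenated along the channel axis) and a 64x64 latent grid for 512x512 images. The function name `build_unet_input` is hypothetical and not part of this bundle.

```python
import numpy as np


def build_unet_input(latents, mask, masked_image_latents):
    """Concatenate the three inpainting inputs along the channel axis.

    Assumed batch-first NCHW shapes:
      latents:              (B, 4, H, W)  noisy latents
      mask:                 (B, 1, H, W)  inpainting mask at latent resolution
      masked_image_latents: (B, 4, H, W)  VAE-encoded masked image
    """
    channels = (latents.shape[1], mask.shape[1], masked_image_latents.shape[1])
    if channels != (4, 1, 4):
        raise ValueError(f"expected 4 + 1 + 4 channels, got {channels}")
    return np.concatenate([latents, mask, masked_image_latents], axis=1)


# Assumption: the VAE downsamples by 8, so 512x512 pixels map to 64x64 latents.
latents = np.random.randn(1, 4, 64, 64).astype(np.float32)
mask = np.zeros((1, 1, 64, 64), dtype=np.float32)
masked_image_latents = np.random.randn(1, 4, 64, 64).astype(np.float32)

unet_input = build_unet_input(latents, mask, masked_image_latents)
print(unet_input.shape)  # (1, 9, 64, 64)
```

A pipeline that feeds the UNet only the 4 latent channels, as a text-to-image model would, will fail the shape check rather than produce an image.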
Known Issues
Keep the following in mind when using this model:
- V-Prediction: SD 2.0 uses v-prediction instead of epsilon-prediction (SD 1.5), requiring different scheduler math
- Quality: Some users report blurry or inconsistent outputs
- Prompt Following: May not follow prompts as accurately as SD 1.5-based models
Recommended Alternative
For better results, use Realistic Vision Inpainting:
- Based on SD 1.5 (epsilon-prediction)
- Higher quality photorealistic output
- Includes NSFW safety checker
- Better prompt adherence
Files
| File | Size | Description |
|---|---|---|
| sd2-inpainting-coreml.zip | 2.39 GB | Full model bundle |
Bundle Contents
Resources/
├── TextEncoder.mlmodelc    # CLIP text encoder
├── Unet.mlmodelc           # 9-channel inpainting UNet
├── VAEDecoder.mlmodelc     # Latent to image decoder
├── VAEEncoder.mlmodelc     # Image to latent encoder
├── vocab.json              # Tokenizer vocabulary
└── merges.txt              # BPE merges
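After unzipping, a quick check that the bundle is complete can save a confusing load failure later. The sketch below only compares names against the listing above. The helper `missing_bundle_files` and the `Resources` path are illustrative assumptions, not part of the bundle.

```python
from pathlib import Path

# Names taken from the bundle listing above.
EXPECTED_FILES = (
    "TextEncoder.mlmodelc",
    "Unet.mlmodelc",
    "VAEDecoder.mlmodelc",
    "VAEEncoder.mlmodelc",
    "vocab.json",
    "merges.txt",
)


def missing_bundle_files(resources_dir):
    """Return the expected bundle entries that are absent from resources_dir."""
    root = Path(resources_dir)
    return [name for name in EXPECTED_FILES if not (root / name).exists()]


# Hypothetical location of the unzipped bundle; adjust to your setup.
missing = missing_bundle_files("Resources")
if missing:
    print("Missing from bundle:", ", ".join(missing))
else:
    print("Bundle looks complete.")
```

Compiled `.mlmodelc` entries are directories, so the check uses `exists()` rather than `is_file()`.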
Usage Notes
If using this model, your inpainting pipeline must handle v-prediction correctly:
# V-prediction formula (SD 2.0)
x₀ = αₜ · xₜ - σₜ · v
# vs Epsilon-prediction formula (SD 1.5)
x₀ = (xₜ - σₜ · ε) / αₜ
Ensure your scheduler implements v-prediction math; otherwise the output will be corrupted.
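The two formulas above can be written as a small Python sketch for sanity-checking a scheduler implementation. It assumes a variance-preserving schedule (αₜ² + σₜ² = 1) and uses scalar values for readability. The function name `predict_x0` is hypothetical, and the `prediction_type` strings follow the naming used by Hugging Face diffusers scheduler configs, which is an assumption about your pipeline.

```python
import math


def predict_x0(x_t, model_output, alpha_t, sigma_t, prediction_type):
    """Recover the denoised estimate x0 from a single model output."""
    if prediction_type == "v_prediction":
        # x0 = alpha_t * x_t - sigma_t * v
        return alpha_t * x_t - sigma_t * model_output
    if prediction_type == "epsilon":
        # x0 = (x_t - sigma_t * eps) / alpha_t
        return (x_t - sigma_t * model_output) / alpha_t
    raise ValueError(f"unsupported prediction_type: {prediction_type!r}")


# Round-trip check with made-up scalar values (alpha^2 + sigma^2 = 1).
alpha_t, sigma_t = 0.8, 0.6
x0_true, eps = 0.5, -1.2

x_t = alpha_t * x0_true + sigma_t * eps  # forward noising step
v = alpha_t * eps - sigma_t * x0_true    # v target for the same sample

x0_from_v = predict_x0(x_t, v, alpha_t, sigma_t, "v_prediction")
x0_from_eps = predict_x0(x_t, eps, alpha_t, sigma_t, "epsilon")

assert math.isclose(x0_from_v, x0_true, rel_tol=1e-9)
assert math.isclose(x0_from_eps, x0_true, rel_tol=1e-9)
```

Both branches recover the same x₀ only when each is given the output it expects. Passing a v output through the epsilon branch, or the reverse, gives a different value, which is the corruption described above.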
License
This model is released under the CreativeML Open RAIL-M License.
You CAN:
- Use commercially
- Redistribute
- Modify and create derivatives
You MUST:
- Include license and attribution
- Not use for illegal purposes
- Not generate content exploiting minors
- Not use for harassment or deception
Attribution
- Original Model: Stable Diffusion 2.0 Inpainting by Stability AI
- CoreML Conversion: Using Apple ml-stable-diffusion
Conversion Details
Converted using Apple's ml-stable-diffusion toolkit:
python -m python_coreml_stable_diffusion.torch2coreml \
--model-version stabilityai/stable-diffusion-2-inpainting \
--convert-unet \
--convert-text-encoder \
--convert-vae-decoder \
--convert-vae-encoder \
--attention-implementation SPLIT_EINSUM \
--bundle-resources-for-swift-cli \
-o output
Related
- Repository: jc-builds/sd2-inpainting-coreml
- Base model: stabilityai/stable-diffusion-2-inpainting