arcee-train (Arcee Training Org)

You can now run Qwen3.5 locally! 💜
Qwen3.5-397B-A17B is an open MoE vision reasoning LLM for agentic coding & chat. It performs on par with Gemini 3 Pro, Claude Opus 4.5 & GPT-5.2.

GGUF: unsloth/Qwen3.5-397B-A17B-GGUF
Run Dynamic 3-bit on a 192GB Mac for 20 tokens/s.
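
As a rough sanity check on the 192GB figure, weight memory at a uniform bit width can be estimated as below. This is a back-of-the-envelope sketch: the helper name `weight_memory_gb` is hypothetical, and it assumes every weight is stored at exactly 3 bits. Dynamic quants mix precisions across layers, so real file sizes will differ, and the KV cache and runtime overhead are not counted.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# 397B total parameters (taken from the model name); uniform 3-bit storage is an assumption.
estimate = weight_memory_gb(397e9, 3)
print(f"{estimate:.1f} GB")
```

Under these assumptions the weights alone come to roughly 149 GB, which leaves some headroom on a 192GB machine.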

Guide: https://unsloth.ai/docs/models/qwen3.5

danielhanchen

posted an update 8 days ago

You can now run MiniMax-M2.5 locally! 🚀
At 230B parameters, MiniMax-M2.5 is the strongest LLM under 700B params, delivering SOTA agentic coding & chat.

Run Dynamic 3/4-bit on a 128GB Mac for 20 tokens/s.
Guide: https://unsloth.ai/docs/models/minimax-2.5
GGUF: unsloth/MiniMax-M2.5-GGUF

MaziyarPanahi

posted an update 13 days ago

Announcing: OpenMed Multilingual PII Detection Models

Today I am releasing 105 open-source models for Personally Identifiable Information (PII) detection in French, German, and Italian.

All Apache 2.0 licensed. Free for commercial use. No restrictions.

Performance:

- French: 97.97% F1 (top model)
- German: 97.61% F1 (top model)
- Italian: 97.28% F1 (top model)

All top-10 models per language exceed 96% F1
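
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below uses the standard definition; the counts are invented for illustration and are not taken from the OpenMed evaluation, and the post does not say whether its scores are micro- or macro-averaged.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (hypothetical, not from the post):
# 97 entities found correctly, 3 spurious detections, 2 missed entities.
example_f1 = f1_score(true_pos=97, false_pos=3, false_neg=2)
print(f"{example_f1:.4f}")
```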

Coverage:

- 55+ PII entity types per language
- Native ID formats: NSS (French), Sozialversicherungsnummer (German), Codice Fiscale (Italian)
- Language-specific address, phone, and name patterns

Training Data:

- French: 49,580 samples
- German: 42,250 samples
- Italian: 40,944 samples

Why Multilingual?

European healthcare operates in European languages. Clinical notes, patient records, and medical documents are generated in French, German, Italian, and other languages.

Effective de-identification requires:

- Native language understanding — not translation
- Local ID format recognition — each country has unique patterns
- Cultural context awareness — names, addresses, and formats vary

These models deliver production-ready accuracy without requiring data to leave your infrastructure or its original language.

HIPAA & GDPR Compliance
Built for US and European privacy regulations:

- On-premise deployment: Process data locally with zero external dependencies
- Data sovereignty: No API calls, no cloud services, no cross-border transfers
- Air-gapped capable: Deploy in fully isolated environments if required
- Regulatory-grade accuracy: Supporting Expert Determination standards

This supports HIPAA and GDPR compliance across languages, without gaps.

Use Cases
- Hospital EHR systems: Automated patient record de-identification
- Clinical research: Multilingual dataset preparation for studies
- Insurance companies: Claims processing across

https://huggingface.co/collections/OpenMed/multilingual-pii-and-de-identification

danielhanchen

posted an update 13 days ago

We collaborated with Hugging Face to enable you to train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). 🤗

Train gpt-oss locally on 12.8GB VRAM with our free notebooks: https://unsloth.ai/docs/new/faster-moe

MaziyarPanahi

posted an update 16 days ago

From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output

I ran 6 experiments trying to use Anthropic's SAE steering for JSON generation.

- Base model: 86.8% valid JSON
- Steering only: 24.4%
- Fine-tuned: 96.6%
- FSM constrained: 100%

Steering is for semantics, not syntax.
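
To illustrate why FSM-constrained decoding reaches 100% validity, here is a toy sketch. It is not the setup from the blog post: the FSM, the token set, and the `toy_scores` stand-in for model logits are all hypothetical. At every step only tokens the FSM allows are considered, so the output is valid by construction even when the "model" prefers garbage.

```python
import json

# Toy FSM for a single-key JSON object: state -> {allowed token: next state}.
FSM = {
    "start": {"{": "key"},
    "key": {'"name"': "colon"},
    "colon": {":": "value"},
    "value": {'"Ada"': "close", '"Bob"': "close"},
    "close": {"}": "done"},
}


def toy_scores(step: int) -> dict:
    """Stand-in for model logits; deliberately ranks an invalid token highest."""
    return {"hello": 5.0, "{": 1.0, '"name"': 1.0, ":": 1.0,
            '"Ada"': 0.5, '"Bob"': 2.0, "}": 1.0}


def constrained_decode(score_fn, fsm, start="start", end="done") -> str:
    """Greedy decoding restricted to the tokens the FSM allows in each state."""
    state, tokens, step = start, [], 0
    while state != end:
        scores = score_fn(step)
        allowed = fsm[state]
        # Mask: choose the best-scoring token among the allowed ones only.
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        tokens.append(token)
        state = allowed[token]
        step += 1
    return "".join(tokens)


text = constrained_decode(toy_scores, FSM)
parsed = json.loads(text)  # parses without error because the FSM only emits valid JSON
print(parsed)
```

Unconstrained greedy decoding with the same scores would emit `hello` at every step; the mask is what guarantees well-formed syntax, while the scores still decide the value.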

https://huggingface.co/blog/MaziyarPanahi/sae-steering-json

MaziyarPanahi

posted an update 17 days ago

🚨 Day 8/8: OpenMed Medical Reasoning Dataset Release - THE GRAND FINALE

Today I complete my 8-day release series with Medical-Reasoning-SFT-Mega.
It is the largest open medical reasoning dataset, combining 7 state-of-the-art AI models with fair-distribution deduplication.

THE 7 SOURCE MODELS (Original Sample Counts):

1. Trinity-Mini: 810,284 samples
2. Qwen3-Next-80B: 604,249 samples
3. GPT-OSS-120B: 506,150 samples
4. Nemotron-Nano-30B: 444,544 samples
5. GLM-4.5-Air: 225,179 samples
6. MiniMax-M2.1: 204,773 samples
7. Baichuan-M3-235B: 124,520 samples

TOTAL BEFORE DEDUPLICATION: 2,919,699 samples
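
The pre-deduplication total can be checked directly from the per-model counts listed above. This is plain arithmetic on the numbers in this post; the dictionary name is just for illustration.

```python
# Per-model sample counts, copied from the list in this post.
source_counts = {
    "Trinity-Mini": 810_284,
    "Qwen3-Next-80B": 604_249,
    "GPT-OSS-120B": 506_150,
    "Nemotron-Nano-30B": 444_544,
    "GLM-4.5-Air": 225_179,
    "MiniMax-M2.1": 204_773,
    "Baichuan-M3-235B": 124_520,
}

total = sum(source_counts.values())
print(f"{total:,}")  # 2,919,699
```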

TOKEN COUNTS:
- Content tokens: 2.22 Billion
- Reasoning tokens: 1.56 Billion
- Total tokens: 3.78 Billion
- Samples with chain-of-thought: 100%

Quick Start:

from datasets import load_dataset
ds = load_dataset("OpenMed/Medical-Reasoning-SFT-Mega")

All datasets Apache 2.0 licensed. Free for research and commercial use.

Thank you for following OpenMed's release series. I can't wait to see what you build. 🔥

OpenMed/Medical-Reasoning-SFT-Mega
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B-V2
OpenMed/Medical-Reasoning-SFT-Trinity-Mini
OpenMed/Medical-Reasoning-SFT-GLM_4.5_Air
OpenMed/Medical-Reasoning-SFT-MiniMax-M2.1
OpenMed/Medical-Reasoning-SFT-Qwen3-Next-80B
OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B
https://huggingface.co/datasets/OpenMed/Medical-Reasonin

https://huggingface.co/collections/OpenMed/medical-datasets

danielhanchen

posted an update 18 days ago

We created a tool-calling guide for local LLMs!

Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling.

Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms

We provide hands-on examples for: story writing, Python execution, terminal tool calls, maths and more.
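
For a sense of what function calling involves on the application side, here is a minimal dispatch sketch. It is not taken from the guide: the JSON shape (`name` plus `arguments`), the `TOOLS` registry, and the hard-coded `fake_output` string standing in for a model response are all assumptions for illustration.

```python
import json


def add(a: float, b: float) -> float:
    """Example tool the model is allowed to call."""
    return a + b


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"add": add}


def dispatch(model_output: str):
    """Parse a JSON tool call and run the matching registered function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Stand-in for text a local model might emit when asked to add two numbers.
fake_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
result = dispatch(fake_output)
print(result)
```

In a real loop the result would be sent back to the model as a tool message so it can compose the final answer.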

danielhanchen

posted an update 20 days ago

Qwen releases Qwen3-Coder-Next! 💜 Run it locally on 46GB RAM or less.

The model excels at agentic coding & local use. With 256K context, it delivers similar performance to models with 10-20× more active parameters.

GGUF: unsloth/Qwen3-Coder-Next-GGUF
Guide: https://unsloth.ai/docs/models/qwen3-coder-next

danielhanchen

posted an update 26 days ago

You can now run Kimi K2.5 locally! 🔥

We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit.
Get >40 tok/s with 242GB of VRAM/RAM, or use 622GB for near-full precision.
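
The size figures can be related with a little arithmetic. The sketch below uses only the rounded numbers quoted in this post (1T parameters, 240GB, -60%) and treats 1 GB as 1e9 bytes, so the results are approximate; the variable names are illustrative.

```python
quantized_gb = 240   # Dynamic 1-bit size quoted in the post
reduction = 0.60     # "-60%" quoted in the post
num_params = 1e12    # "1T model"

# Size the quoted reduction implies for the unquantized checkpoint.
implied_original_gb = quantized_gb / (1 - reduction)

# Average storage per weight across the whole quantized file.
avg_bits_per_weight = quantized_gb * 1e9 * 8 / num_params

print(f"implied original: {implied_original_gb:.0f} GB")
print(f"average bits per weight: {avg_bits_per_weight:.2f}")
```

The average lands near 2 bits per weight rather than 1, which is consistent with a dynamic scheme that keeps some layers at higher precision.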

GGUF: unsloth/Kimi-K2.5-GGUF

Guide: https://unsloth.ai/docs/models/kimi-k2.5

danielhanchen

posted an update about 1 month ago

You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, improving search, clustering, and recommendations on your data.

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!

danielhanchen

posted an update about 1 month ago

Run GLM-4.7-Flash locally on your device with 24GB RAM!🔥

It's the best-performing 30B model on SWE-Bench and GPQA. With 200K context, it excels at coding, agents, chat & reasoning.

GGUF: unsloth/GLM-4.7-Flash-GGUF

Guide: https://unsloth.ai/docs/models/glm-4.7-flash

Arcee Training Org

AI & ML interests

Recent Activity

Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation

Arcee Trinity Large Technical Report

Team members 44

arcee-train's activity