Qwen2.5-Coder-0.5B — Synoema Tools v1
LoRA adapter fine-tuned on Qwen2.5-Coder-0.5B-Instruct for agentic tool-use with the Synoema MCP server.
Score: 92.9% (26/28 tasks) on the Synoema MCP agentic evaluation benchmark.
What is Synoema?
Synoema is an LLM-native programming language and runtime:
- BPE-aligned operators — every operator maps to exactly 1 cl100k_base token
- GBNF grammar for constrained decoding (structural correctness guarantee)
- Cranelift JIT + WebAssembly compilation targets
- MCP server exposing `file_write`, `file_read`, `sno_typecheck`, `sno_run`, `search_corpus` tools
- Contract annotations (`requires`/`ensures`) for formal verification
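MCP clients invoke server tools with JSON-RPC 2.0 `tools/call` requests. As a rough illustration of what a call to `file_write` might look like on the wire, the sketch below builds one such message. The helper name `make_tool_call` and the argument names `path` and `content` are hypothetical; the server's real input schema is not given here and would come from its `tools/list` response.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names; check the server's tool schema before relying on them.
request = make_tool_call(
    1,
    "file_write",
    {"path": "src/max.sno", "content": "max x y = ? x > y -> x : y\n"},
)
wire_message = json.dumps(request)
```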
Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-0.5B-Instruct (via unsloth 4-bit) |
| Method | QLoRA (4-bit NF4 quantization + LoRA) |
| LoRA rank | r=8, alpha=32 |
| Target modules | q/k/v/o proj + gate/up/down proj (all attention + FFN) |
| Batch | 4 × grad_accum=4 = effective batch 16 |
| Sequence length | 1024 tokens |
| Epochs | 3 per cycle |
| Optimizer | AdamW with cosine decay |
| Training corpus | ~14,778 examples (tool-use + codegen) |
| Training time | ~84 min/cycle × 8 cycles = ~11h total carousel |
| Training hardware | AMD RX 7900 GRE 16GB (ROCm + unsloth) |
| Carousel cycles | 8 cycles (C1→C8), each starting from best previous |
| Cycle C8 loss | 0.022 |
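The batch and timing figures in the table can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes a single device, no dropped final batch, and the final corpus size of roughly 14,778 examples, whereas the corpus actually grew from cycle to cycle, so earlier cycles would have run somewhat fewer steps.

```python
import math

per_device_batch = 4
grad_accum = 4
epochs_per_cycle = 3
corpus_size = 14_778    # approximate final corpus size
minutes_per_cycle = 84  # approximate
cycles = 8

effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(corpus_size / effective_batch)
steps_per_cycle = steps_per_epoch * epochs_per_cycle
total_hours = minutes_per_cycle * cycles / 60

print(effective_batch)        # 16
print(steps_per_epoch)        # 924
print(steps_per_cycle)        # 2772
print(round(total_hours, 1))  # 11.2
```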
Training Approach: Carousel Fine-tuning
This model was trained using a carousel strategy:
Base model → C1 (eval) → C2 (eval) → ... → C8 (best: 92.9%)
↑ always from best adapter
Each cycle:
- Merge corpus — base corpus + all targeted examples for failing tasks
- Train 3 epochs from the best previous adapter
- Eval on 28 agentic tasks (real tool calls, real typecheck)
- Analyze failures → generate targeted examples → add to corpus
- Repeat from best adapter
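The cycle described above can be sketched as a small driver loop. This is an illustrative outline, not the project's training script: `run_carousel`, `CycleResult`, and the three callables (`train`, `evaluate`, `make_targeted`) are hypothetical names, and breaking ties in favour of the earlier adapter is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class CycleResult:
    cycle: int
    adapter: str        # identifier of the adapter produced in this cycle
    parent: str         # adapter this cycle started from
    passed: int         # number of eval tasks passed
    failing: List[str]  # names of the tasks that failed


def run_carousel(
    train: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[int, List[str]]],
    make_targeted: Callable[[List[str]], List[str]],
    base_corpus: Sequence[str],
    n_cycles: int,
    start_adapter: str = "base",
) -> Tuple[CycleResult, List[CycleResult]]:
    """Run n_cycles of train -> eval -> targeted-example generation."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    corpus = list(base_corpus)
    best: Optional[CycleResult] = None
    history: List[CycleResult] = []
    for cycle in range(1, n_cycles + 1):
        # Always restart from the best adapter seen so far, not the latest one.
        parent = best.adapter if best else start_adapter
        adapter = train(parent, corpus)
        passed, failing = evaluate(adapter)
        result = CycleResult(cycle, adapter, parent, passed, failing)
        history.append(result)
        if best is None or passed > best.passed:
            best = result
        # Failures feed new targeted examples into the next cycle's corpus.
        corpus.extend(make_targeted(failing))
    return best, history
```

Because each cycle restarts from the best adapter rather than the most recent one, a regression in one cycle does not carry over into the next.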
Evaluation: 28 Agentic Tasks
The model is evaluated on real multi-turn tool-use scenarios. Each task requires
calling MCP tools correctly in sequence. The eval runs actual sno typecheck and
sno run commands — no mock results.
Result: 26/28 tasks passed (92.9%)
| Category | Tasks | Passed |
|---|---|---|
| Basic write + typecheck | TU1, TU2, TU3, TU5, TU6 | 5/5 ✅ |
| Multi-step (search→write→run) | TU9, TU20 | 2/2 ✅ |
| Language features (cons/ADT/HOF) | TU11, TU14–TU19 | 7/7 ✅ |
| Ternary + complex expressions | TU22, TU30 | 2/2 ✅ |
| List comprehension | TU12, TU26 | 2/2 ✅ |
| Write-only (no run) | TU10 | 1/1 ✅ |
| String ops | TU18, TU25 | 2/2 ✅ |
| Pattern matching | TU11, TU19 | 2/2 ✅ |
| Fix error (if/else → ternary) | TU4, TU13 | 0/2 ❌ |
Remaining failures:
- TU4: must write `if x > y then x else y`, see the typecheck error, then fix to `? x > y -> x : y` (2-write pattern)
- TU13: same pattern with `classify n = if n > 0 then 1 else 0` → `? n > 0 -> 1 : 0`
Both require a strict write→typecheck→rewrite→typecheck sequence with exactly 2 file_write calls.
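One way to read that requirement is as a check over the ordered list of tool names the model called. The sketch below is an assumption about how such a check could look, not the actual eval harness: the function name `follows_fix_pattern` is hypothetical, calls to other tools are ignored here, and a real harness would presumably also inspect the file contents and typecheck results.

```python
from typing import Iterable

EXPECTED = ["file_write", "sno_typecheck", "file_write", "sno_typecheck"]


def follows_fix_pattern(tool_calls: Iterable[str]) -> bool:
    """Return True for a write -> typecheck -> rewrite -> typecheck trace.

    Only file_write and sno_typecheck calls are considered, so both their
    order and their count (exactly two writes) must match.
    """
    relevant = [
        name for name in tool_calls if name in ("file_write", "sno_typecheck")
    ]
    return relevant == EXPECTED


# A model that writes the ternary form directly makes only one write and fails.
print(follows_fix_pattern(["file_write", "sno_typecheck"]))  # False
print(follows_fix_pattern(EXPECTED))                         # True
```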
Usage
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct")
model = PeftModel.from_pretrained(base, "Delimitter/qwen2.5-0.5b-synoema-tools-v1")
With unsloth (recommended for inference):
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="Delimitter/qwen2.5-0.5b-synoema-tools-v1",
max_seq_length=1024,
load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
System prompt format (ChatML):
<|im_start|>system
You are an AI coding assistant for the Synoema programming language...
<|im_end|>
<|im_start|>user
Write a quicksort in Synoema to src/qs.sno and run it.
<|im_end|>
<|im_start|>assistant
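For reference, the same three-turn layout can be assembled as a plain string. The helper name `build_chatml` is hypothetical, and the sketch places `<|im_end|>` directly after each message body, which is how Qwen's stock chat template is generally rendered; treat that detail as an assumption. When the tokenizer is loaded, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is the safer route.

```python
def build_chatml(system: str, user: str) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml(
    "You are an AI coding assistant for the Synoema programming language...",
    "Write a quicksort in Synoema to src/qs.sno and run it.",
)
```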
Corpus Composition
| Source | Examples | Description |
|---|---|---|
| tool_use_train_v17_fix.jsonl | 676 | Fix-error patterns (if/else→ternary) |
| tool_use_train_v16_gen.jsonl | ~3500 | Write+check+run patterns |
| tool_use_train_lang_v1.jsonl | ~3000 | Synoema language codegen |
| targeted_seq_c* files | ~400 | Carousel-generated targeted examples |
| Other validated sources | ~7200 | Mixed tool-use patterns |
| Total | ~14,778 | |
All examples validated with sno check + sno run before training.
Training History (Carousel)
| Cycle | Score | Failing tasks |
|---|---|---|
| C1 | 89.3% (25/28) | TU4, TU13, TU20 |
| C2 | 82.1% (23/28) | TU4, TU9, TU12, TU13, TU20 |
| C3 | 85.7% (24/28) | TU4, TU12, TU13, TU20 |
| C4 | 78.6% (22/28) | TU4, TU10, TU12, TU13, TU20 |
| C5 | 85.7% (24/28) | TU4, TU12, TU13, TU20 |
| C6 | 85.7% (24/28) | TU4, TU12, TU13, TU20 |
| C7 | 89.3% (25/28) | TU4, TU12, TU13 |
| C8 | 92.9% (26/28) 🏆 | TU4, TU13 |
| C9+ | 50–82% | Catastrophic forgetting |
C8 was selected as best before catastrophic forgetting set in at C9.
Synoema Language Quick Reference
-- Ternary (no if/else!)
max x y = ? x > y -> x : y
-- Pattern matching
fact 0 = 1
fact n = n * fact (n - 1)
-- List comprehension
evens = [x | x <- [1..20], x % 2 == 0]
-- Space-separated lists (NOT commas)
main = qsort [3 1 4 1 5 9]
-- ADT
Shape = Circle Int | Rect Int Int
area (Circle r) = 3 * r * r
License
Apache 2.0 — same as Qwen2.5-Coder base model.
Synoema is © Andrey Bubnov. See synoema.tech.