Alexandros Liapatis (alexliap)
3 followers · 7 following
alexliap · alexandros-liapatis
AI & ML interests
Generative AI + Traditional ML
Recent Activity
- liked a model 12 days ago: openai/privacy-filter
- liked a model 14 days ago: RedHatAI/Qwen3.6-35B-A3B-NVFP4
- reacted to sergiopaniego's post with 🔥 18 days ago:
Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy. And… it's already supported in TRL, built by Kashif Rasul. You can really feel the pace of development in the team 🔥

Paper by Ruixiang Zhang, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, and Yizhe Zhang at Apple.

How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. No labels or verifier needed.

You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder): https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py
Or benchmark a checkpoint with the eval script: https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py

One neat insight from the paper: T_train and T_eval compose into an effective temperature T_eff = T_train × T_eval, so a broad band of configurations works well; even very noisy samples still help.

Want to dig deeper?
Paper: https://huggingface.co/papers/2604.01193
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer
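For intuition, here is a minimal sketch of that loop written with plain transformers rather than the TRL trainer linked above; the model name, prompt, sampling settings, and learning rate are illustrative assumptions, not the paper's or TRL's actual configuration.

```python
# Minimal sketch of the self-distillation loop described above: sample from
# the model at T_train with top_k/top_p truncation, then fine-tune on the
# sampled completion with plain cross-entropy. Illustrative only; this is
# NOT the TRL SSDTrainer API, and all hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"  # assumption: any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

prompts = ["Write a Python function that reverses a linked list."]  # illustrative

T_train = 1.2  # sampling temperature for the self-generated data
# Paper's observation: the temperatures compose as T_eff = T_train * T_eval,
# so many (T_train, T_eval) pairs behave alike at evaluation time.

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    # 1) Sample a completion from the model itself with truncated sampling.
    model.eval()
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            do_sample=True,
            temperature=T_train,
            top_k=50,
            top_p=0.95,
            max_new_tokens=256,
        )

    # 2) Fine-tune on the sampled completion with plain cross-entropy,
    #    masking the prompt tokens so only the completion contributes.
    model.train()
    labels = generated.clone()
    labels[:, :prompt_len] = -100  # ignore_index for the LM loss
    loss = model(input_ids=generated, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The key design point the post highlights is in step 2: because the targets are the model's own samples, no labeled data or external verifier enters the loop, only the cross-entropy between the model and its truncated sampling distribution.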
Organizations
None yet
alexliap's datasets (2)

alexliap/tinystories-gr
Viewer • Updated Mar 14 • 2.14M • 69

alexliap/high-quality-gr-text
Viewer • Updated Feb 2 • 5.03M • 95 • 2