In a Training Loop 🔄

Behrooz Azarkhalili

ermiaazarkhalili

AI & ML interests

LLMs, VLMs, PEFT, RL for LLMs and VLMs.

Recent Activity

published a model 5 days ago

ermiaazarkhalili/SmolLM2-135M-Instruct-GRPO-NuminaMath-50K

published a model 5 days ago

ermiaazarkhalili/SmolLM2-1.7B-Instruct-GRPO-NuminaMath-50K

published a model 5 days ago

ermiaazarkhalili/LFM2-2.6B-GRPO-NuminaMath-50K

upvoted an article 12 days ago

Article

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

30 days ago

upvoted a paper 22 days ago

DeepCode: Open Agentic Coding

Paper • 2512.07921 • Published 30 days ago • 31

upvoted 5 articles about 1 month ago

Article

Building Deep Research: How we Achieved State of the Art

Nov 24, 2025

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

Sep 23, 2025 • 134

Article

We Got Claude to Fine-Tune an Open Source LLM

Dec 4, 2025 • 565

Article

Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand

Dec 4, 2025

Article

Building Jobly: Semantic Job Matching with RAG and Vector Embeddings

Nov 28, 2025

upvoted an article 3 months ago

Article

Supercharge your OCR Pipelines with Open Models

Oct 21, 2025 • 295

upvoted a collection 3 months ago

ExGRPO

Collection

Model collections trained using ExGRPO. • 7 items • Updated Oct 3, 2025 • 1

upvoted a paper 5 months ago

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Paper • 2508.14896 • Published Aug 20, 2025 • 22

upvoted 5 articles 5 months ago

Article

Upskill your LLMs With Gradio MCP Servers

Jul 9, 2025

Article

Generate Images with Claude and Hugging Face

Aug 19, 2025

Article

Multimodal RAG with Colpali, Milvus and VLMs

Dec 10, 2024

Article

How I Built 7 Custom Gradio Components in Just 12 Days!

Aug 12, 2025

Article

Vision Language Model Alignment in TRL ⚡️

Aug 7, 2025 • 105

upvoted a collection 5 months ago

Qwen3-MegaScience

Collection

Qwen3-MegaScience • 5 items • Updated Jul 23, 2025 • 4

upvoted a paper 5 months ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22, 2025 • 63

upvoted an article 5 months ago

Article

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Jul 29, 2025 • 206

upvoted a collection 6 months ago

Kimi-K2

Collection

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated Nov 14, 2025 • 162

upvoted an article 6 months ago

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11, 2025
