Building on HF

merve PRO

merve

huggingface

https://github.com/merveenoyan/smol-vision

AI & ML interests

I love this website VLMs, vision & co

Recent Activity

liked a model 12 days ago

nvidia/magpie_tts_multilingual_357m

liked a model 12 days ago

zai-org/GLM-4.7

updated a dataset 12 days ago

merve/personal-website

published an article 16 days ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

16 days ago

•

91

published an article 2 months ago

Article

Streaming datasets: 100x More Efficient

+3

Oct 27, 2025

•

75

published an article 2 months ago

Article

Supercharge your OCR Pipelines with Open Models

+5

Oct 21, 2025

•

289

published an article 3 months ago

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

+3

Sep 23, 2025

•

134

published an article 5 months ago

Article

Vision Language Model Alignment in TRL ⚡️

+3

Aug 7, 2025

•

105

published an article 6 months ago

Article

Introducing ColQwen-Omni: Retrieve in every modality

Jul 17, 2025

•

75

published an article 7 months ago

Article

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

+3

Jun 19, 2025

•

95

published an article 7 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

+5

Jun 12, 2025

•

151

published an article 7 months ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

Jun 3, 2025

•

305

published an article 8 months ago

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

+5

May 21, 2025

•

247

published an article 8 months ago

Article

Vision Language Models (Better, faster, stronger)

+3

May 12, 2025

•

580

published an article 8 months ago

Article

Welcoming Llama Guard 4 on Hugging Face Hub

+2

Apr 29, 2025

•

40

published an article 9 months ago

Article

Cohere on Hugging Face Inference Providers 🔥

+5

Apr 16, 2025

•

129

published an article 10 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

+2

Mar 12, 2025

•

480

published an article 11 months ago

Article

SigLIP 2: A better multilingual vision language encoder

+1

Feb 21, 2025

•

193

published an article 11 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

+5

Feb 20, 2025

•

320

published an article 11 months ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

+1

Feb 19, 2025

•

74