LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling 29 days ago • 50
LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR Oct 23, 2025 • 73
LightOn Optical Processing Unit: Scaling-up AI and HPC with a Non von Neumann co-processor Paper • 2107.11814 • Published Jul 25, 2021
Is the Number of Trainable Parameters All That Actually Matters? Paper • 2109.11928 • Published Sep 24, 2021
ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models Paper • 2602.16609 • Published 22 days ago • 6
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 25
Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer Paper • 2510.05846 • Published Oct 7, 2025 • 3
Seq vs Seq: An Open Suite of Paired Encoders and Decoders Paper • 2507.11412 • Published Jul 15, 2025 • 31
BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP Paper • 2506.10896 • Published Jun 12, 2025 • 4
Splitformer: An improved early-exit architecture for automatic speech recognition on edge devices Paper • 2506.18035 • Published Jun 22, 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 161
Multitask Prompted Training Enables Zero-Shot Task Generalization Paper • 2110.08207 • Published Oct 15, 2021 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 37