lightonai's collections (14)

DenseOn & LateOn
A collection of open state-of-the-art single and multi-vector models
ColBERT-Zero 🐢
First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT
Ettin
A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters
PAGnol 🇫🇷
French language models. These models were trained in early 2021, following the scaling laws known at the time and using the exact same training data as CamemBERT.
LightOnOCR-2 🦉
LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family
LateOn-Code 💻
State-of-the-art late interaction code retrieval models
PyLate
State-of-the-art late interaction models trained using PyLate
ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
RITA 🧿
A suite of autoregressive generative models for protein sequences, with up to 1.2B parameters, trained on over 280 million protein sequences.