LGAI-EXAONE/K-EXAONE-236B-A23B Text Generation • 237B • Updated about 8 hours ago • 14k • 544
Article Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries Dec 22, 2025 • 7
🦢SWIM-IR Dataset [NAACL'24] Collection 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated Mar 31, 2025 • 8
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 162
intfloat/multilingual-e5-base Sentence Similarity • 0.3B • Updated Feb 17, 2025 • 1.97M • 334