DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published 19 days ago • 202
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding Paper • 2512.17532 • Published 18 days ago • 65
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Text Generation • 32B • Updated about 21 hours ago • 272k • 531
NEW: @mistralai released a fantastic family of multimodal models, Ministral 3. You can fine-tune them for free on Colab using TRL ⚡️, with support for both SFT and GRPO. Link to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index
Interested in RL training environments? We just released a beginner-friendly walkthrough notebook! Train a model to play Wordle using TRL + OpenEnv (TextArena) + GRPO + vLLM. Happy learning! 🌱
Notebook: https://github.com/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb
OpenEnv guide in TRL: https://huggingface.co/docs/trl/main/en/openenv