PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 111
Group-in-Group Policy Optimization for LLM Agent Training Paper • 2505.10978 • Published May 16, 2025 • 18
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration Paper • 2508.11379 • Published Aug 15, 2025 • 12
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success Paper • 2508.04280 • Published Aug 6, 2025 • 35
Reinforcement Learning for Long-Horizon Interactive LLM Agents Paper • 2502.01600 • Published Feb 3, 2025 • 1
Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! Mar 7, 2025 • 89
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 23 items • Updated 13 days ago • 103
Position: Interactive Generative Video as Next-Generation Game Engine Paper • 2503.17359 • Published Mar 21, 2025 • 61
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published Feb 25, 2025 • 67
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20, 2025 • 174
You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published Feb 13, 2025 • 37
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5, 2025 • 60
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3, 2025 • 113
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 123
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 38
Mechanistic Permutability: Match Features Across Layers Paper • 2410.07656 • Published Oct 10, 2024 • 20