AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 19 days ago • 101
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 25 days ago • 231
Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction Paper • 2605.26230 • Published 7 days ago • 39
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 6 days ago • 128
ResearchMath-14K: Scaling Research-Level Mathematics via Agents Paper • 2605.28003 • Published 5 days ago • 46
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 5 days ago • 405
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 5 days ago • 79
WorldKV: Efficient World Memory with World Retrieval and Compression Paper • 2605.22718 • Published 11 days ago • 41
FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching Paper • 2605.20910 • Published 12 days ago • 29
RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models Paper • 2603.21341 • Published Mar 22 • 24
SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models Paper • 2602.04208 • Published Feb 4 • 20
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo Paper • 2605.16257 • Published 17 days ago • 52
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 28 days ago • 347
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published Apr 6 • 46