Read 2026
mHC: Manifold-Constrained Hyper-Connections
Paper
• 2512.24880
• Published
• 313
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
Paper
• 2512.23988
• Published
• 19
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
Paper
• 2512.25075
• Published
• 15
Guiding a Diffusion Transformer with the Internal Dynamics of Itself
Paper
• 2512.24176
• Published
• 8
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
Paper
• 2512.24165
• Published
• 51
AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
Paper
• 2601.00796
• Published
• 32
Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning
Paper
• 2512.24146
• Published
• 14
Paper
• 2601.00417
• Published
• 34
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper
• 2601.03233
• Published
• 157
SOP: A Scalable Online Post-Training System for Vision-Language-Action Models
Paper
• 2601.03044
• Published
• 28
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Paper
• 2601.05242
• Published
• 228
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models
Paper
• 2601.03425
• Published
• 16
RelayLLM: Efficient Reasoning via Collaborative Decoding
Paper
• 2601.05167
• Published
• 31
AgentOCR: Reimagining Agent History via Optical Self-Compression
Paper
• 2601.04786
• Published
• 30
Over-Searching in Search-Augmented Large Language Models
Paper
• 2601.05503
• Published
• 7
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
Paper
• 2601.07226
• Published
• 33
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
Paper
• 2601.07351
• Published
• 26
Dr. Zero: Self-Evolving Search Agents without Training Data
Paper
• 2601.07055
• Published
• 22
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale
Paper
• 2601.08225
• Published
• 52
The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents
Paper
• 2601.07264
• Published
• 24
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices
Paper
• 2601.08303
• Published
• 18
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
Paper
• 2601.09708
• Published
• 53
Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments
Paper
• 2601.01075
• Published
• 6
The AI Hippocampus: How Far are We From Human Memory?
Paper
• 2601.09113
• Published
• 5
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
Paper
• 2601.08763
• Published
• 148
Alterbute: Editing Intrinsic Attributes of Objects in Images
Paper
• 2601.10714
• Published
• 31
Transition Matching Distillation for Fast Video Generation
Paper
• 2601.09881
• Published
• 33
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text
Paper
• 2601.10355
• Published
• 39
Language of Thought Shapes Output Diversity in Large Language Models
Paper
• 2601.11227
• Published
• 9
More Images, More Problems? A Controlled Analysis of VLM Failure Modes
Paper
• 2601.07812
• Published
• 6
Toward Efficient Agents: Memory, Tool learning, and Planning
Paper
• 2601.14192
• Published
• 56
Agentic Reasoning for Large Language Models
Paper
• 2601.12538
• Published
• 200
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
Paper
• 2601.16208
• Published
• 53
PROGRESSLM: Towards Progress Reasoning in Vision-Language Models
Paper
• 2601.15224
• Published
• 12
360Anything: Geometry-Free Lifting of Images and Videos to 360°
Paper
• 2601.16192
• Published
• 8
Agentic Uncertainty Quantification
Paper
• 2601.15703
• Published
• 9
MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences
Paper
• 2601.07251
• Published
• 11
Paper
• 2601.17237
• Published
• 10
Agentic Very Long Video Understanding
Paper
• 2601.18157
• Published
• 18
Shaping capabilities with token-level data filtering
Paper
• 2601.21571
• Published
• 27
KromHC: Manifold-Constrained Hyper-Connections with Kronecker-Product Residual Matrices
Paper
• 2601.21579
• Published
• 6
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
Paper
• 2601.22975
• Published
• 109
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
Paper
• 2601.21468
• Published
• 25
LMK > CLS: Landmark Pooling for Dense Embeddings
Paper
• 2601.21525
• Published
• 4
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
Paper
• 2602.03139
• Published
• 42
Rethinking the Trust Region in LLM Reinforcement Learning
Paper
• 2602.04879
• Published
• 36
Protein Autoregressive Modeling via Multiscale Structure Generation
Paper
• 2602.04883
• Published
• 3
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration
Paper
• 2602.01734
• Published
• 32
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
Paper
• 2602.08222
• Published
• 278
Towards Agentic Intelligence for Materials Science
Paper
• 2602.00169
• Published
• 46
Reliable and Responsible Foundation Models: A Comprehensive Survey
Paper
• 2602.08145
• Published
• 8
Col-Bandit: Zero-Shot Query-Time Pruning for Late-Interaction Retrieval
Paper
• 2602.02827
• Published
• 2
Stable Velocity: A Variance Perspective on Flow Matching
Paper
• 2602.05435
• Published
• 3
The Pensieve Paradigm: Stateful Language Models Mastering Their Own Context
Paper
• 2602.12108
• Published
• 13
Free(): Learning to Forget in Malloc-Only Reasoning Models
Paper
• 2602.08030
• Published
• 5
When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models
Paper
• 2602.10179
• Published
• 6
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics
Paper
• 2602.12617
• Published
• 20
Experiential Reinforcement Learning
Paper
• 2602.13949
• Published
• 70
SPILLage: Agentic Oversharing on the Web
Paper
• 2602.13516
• Published
Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks
Paper
• 2602.14689
• Published
• 1
On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
Paper
• 2602.15322
• Published
• 9
Visual Persuasion: What Influences Decisions of Vision-Language Models?
Paper
• 2602.15278
• Published
• 3
The Vision Wormhole: Latent-Space Communication in Heterogeneous Multi-Agent Systems
Paper
• 2602.15382
• Published
• 2
Causal-JEPA: Learning World Models through Object-Level Latent Interventions
Paper
• 2602.11389
• Published
• 5
Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality
Paper
• 2602.14080
• Published
• 20
Multi-agent cooperation through in-context co-player inference
Paper
• 2602.16301
• Published
• 24
Reinforced Fast Weights with Next-Sequence Prediction
Paper
• 2602.16704
• Published
• 13
World Action Models are Zero-shot Policies
Paper
• 2602.15922
• Published
• 13
Unified Latents (UL): How to train your latents
Paper
• 2602.17270
• Published
• 57
DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers
Paper
• 2602.16968
• Published
• 12
NeST: Neuron Selective Tuning for LLM Safety
Paper
• 2602.16835
• Published
• 1
CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing
Paper
• 2602.15823
• Published
• 3
Does Your Reasoning Model Implicitly Know When to Stop Thinking?
Paper
• 2602.08354
• Published
• 259
Spanning the Visual Analogy Space with a Weight Basis of LoRAs
Paper
• 2602.15727
• Published
• 14
VLANeXt: Recipes for Building Strong VLA Models
Paper
• 2602.18532
• Published
• 52
On Data Engineering for Scaling LLM Terminal Capabilities
Paper
• 2602.21193
• Published
• 90
Test-Time Training with KV Binding Is Secretly Linear Attention
Paper
• 2602.21204
• Published
• 30
One-step Language Modeling via Continuous Denoising
Paper
• 2602.16813
• Published
• 4
Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
Paper
• 2602.21196
• Published
• 4
The Diffusion Duality, Chapter II: Ψ-Samplers and Efficient Curriculum
Paper
• 2602.21185
• Published
• 3
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
Paper
• 2602.21548
• Published
• 38
Image Generation with a Sphere Encoder
Paper
• 2602.15030
• Published
• 15
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
Paper
• 2602.21778
• Published
• 12
SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models
Paper
• 2602.18993
• Published
• 4
Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting
Paper
• 2602.20933
• Published
• 4
VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale
Paper
• 2602.23361
• Published
• 13
Causal Motion Diffusion Models for Autoregressive Motion Generation
Paper
• 2602.22594
• Published
• 7
AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning
Paper
• 2602.23258
• Published
• 27