CutClaw: Agentic Hours-Long Video Editing via Music Synchronization Paper • 2603.29664 • Published 2 days ago • 32
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells Paper • 2603.25240 • Published 7 days ago • 73
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 9
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 4 days ago • 121
On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers Paper • 2603.28762 • Published 3 days ago • 22
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 2 days ago • 37
TAPS: Task Aware Proposal Distributions for Speculative Sampling Paper • 2603.27027 • Published 5 days ago • 134
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing Paper • 2603.28713 • Published 3 days ago • 15
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models Paper • 2603.27481 • Published 4 days ago • 33
Article How I contributed a new model to the Transformers library using Codex 3 days ago • 36
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation Paper • 2603.25804 • Published 7 days ago • 25
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published 7 days ago • 146
Representation Alignment for Just Image Transformers is not Easier than You Think Paper • 2603.14366 • Published 18 days ago • 13
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published 7 days ago • 55