DreamFoley: Scalable VLMs for High-Fidelity Video-to-Audio Generation • Paper • 2512.06022 • Published Dec 4, 2025 • 3
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture • Paper • 2512.04810 • Published about 1 month ago • 25