magicwpf

https://magicwpf.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

upvoted a paper 9 days ago

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

upvoted a paper 10 days ago

SemanticGen: Video Generation in Semantic Space

Organizations

None yet

authored 4 papers 8 months ago

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Paper • 2410.08260 • Published Oct 10, 2024

SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning

Paper • 2504.00396 • Published Apr 1, 2025 • 3

HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment

Paper • 2503.23907 • Published Mar 31, 2025 • 3

A Survey of Interactive Generative Video

Paper • 2504.21853 • Published Apr 30, 2025 • 46

authored 5 papers 9 months ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published Mar 31, 2025 • 76

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18, 2025 • 28

authored 11 papers 10 months ago

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Paper • 2408.11813 • Published Aug 21, 2024 • 12

MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding

Paper • 2410.21747 • Published Oct 29, 2024

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Paper • 2412.07760 • Published Dec 10, 2024 • 55

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Paper • 2412.07759 • Published Dec 10, 2024 • 18

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published Dec 10, 2024 • 20

VIVID-10M: A Dataset and Baseline for Versatile and Interactive Video Local Editing

Paper • 2411.15260 • Published Nov 22, 2024

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation

Paper • 2411.14423 • Published Nov 21, 2024

Towards Precise Scaling Laws for Video Diffusion Transformers

Paper • 2411.17470 • Published Nov 25, 2024 • 1

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

Paper • 2312.13305 • Published Dec 20, 2023

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Paper • 2501.04698 • Published Jan 8, 2025 • 15

GameFactory: Creating New Games with Generative Interactive Videos

Paper • 2501.08325 • Published Jan 14, 2025 • 67
