AELXKANG21's Collection: any size diffusion
• Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images (arXiv:2308.16582)
• DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (arXiv:2310.13119)
• DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior (arXiv:2310.16818)
• Text-to-3D with Classifier Score Distillation (arXiv:2310.19415)
• SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction (arXiv:2310.20700)
• Controlling Text-to-Image Diffusion by Orthogonal Finetuning (arXiv:2306.07280)
• DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model (arXiv:2311.09217)
• One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion (arXiv:2311.07885)
• Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (arXiv:2311.10709)
• LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching (arXiv:2311.11284)
• Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models (arXiv:2311.12092)
• Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models (arXiv:2311.13141)
• ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs (arXiv:2311.13600)
• An Embodied Generalist Agent in 3D World (arXiv:2311.12871)
• LEDITS++: Limitless Image Editing using Text-to-Image Models (arXiv:2311.16711)
• GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs (arXiv:2312.00093)
• CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation (arXiv:2311.18775)
• VideoBooth: Diffusion-based Video Generation with Image Prompts (arXiv:2312.00777)
• HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models (arXiv:2312.00079)
• Segment and Caption Anything (arXiv:2312.00869)
• PyNeRF: Pyramidal Neural Radiance Fields (arXiv:2312.00252)
• DeepCache: Accelerating Diffusion Models for Free (arXiv:2312.00858)
• Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training (arXiv:2312.01663)
• HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting (arXiv:2312.03461)
• Cache Me if You Can: Accelerating Diffusion Models through Block Caching (arXiv:2312.03209)
• TokenCompose: Grounding Diffusion with Token-level Supervision (arXiv:2312.03626)
• HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces (arXiv:2312.03160)
• Context Diffusion: In-Context Aware Image Generation (arXiv:2312.03584)
• MotionCtrl: A Unified and Flexible Motion Controller for Video Generation (arXiv:2312.03641)
• LooseControl: Lifting ControlNet for Generalized Depth Conditioning (arXiv:2312.03079)
• Customizing Motion in Text-to-Video Diffusion Models (arXiv:2312.04966)
• Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors (arXiv:2312.04963)
• MVDD: Multi-View Depth Diffusion Models (arXiv:2312.04875)
• 3D-LLM: Injecting the 3D World into Large Language Models (arXiv:2307.12981)
• DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models (arXiv:2312.05107)
• Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution (arXiv:2312.06640)
• NeRFiller: Completing Scenes via Generative 3D Inpainting (arXiv:2312.04560)
• Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation (arXiv:2312.07231)
• Clockwork Diffusion: Efficient Generation With Model-Step Distillation (arXiv:2312.08128)
• CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor (arXiv:2312.07661)
• UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation (arXiv:2312.08754)
• VideoLCM: Video Latent Consistency Model (arXiv:2312.09109)
• Mosaic-SDF for 3D Generative Models (arXiv:2312.09222)
• Holodeck: Language Guided Generation of 3D Embodied AI Environments (arXiv:2312.09067)
• FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection (arXiv:2312.09252)
• Pixel Aligned Language Models (arXiv:2312.09237)
• SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds (arXiv:2312.09246)
• SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance (arXiv:2312.08889)
• Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models (arXiv:2312.09608)
• Stable Score Distillation for High-Quality 3D Generation (arXiv:2312.09305)
• Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models (arXiv:2312.04533)
• Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method (arXiv:2312.12030)
• TIP: Text-Driven Image Processing with Semantic and Restoration Instructions (arXiv:2312.11595)
• Tracking Any Object Amodally (arXiv:2312.12433)
• MixRT: Mixed Neural Representations For Real-Time NeRF Rendering (arXiv:2312.11841)
• Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior (arXiv:2312.11535)
• GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning (arXiv:2312.11461)
• Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting (arXiv:2312.13271)
• SpecNeRF: Gaussian Directional Encoding for Specular Reflections (arXiv:2312.13102)
• InstructVideo: Instructing Video Diffusion Models with Human Feedback (arXiv:2312.12490)
• StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation (arXiv:2312.12491)
• Splatter Image: Ultra-Fast Single-View 3D Reconstruction (arXiv:2312.13150)
• UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections (arXiv:2312.13285)
• Model-Based Control with Sparse Neural Dynamics (arXiv:2312.12791)
• LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset (arXiv:2312.12418)
• Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation (arXiv:2312.13469)
• Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models (arXiv:2312.13913)
• HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models (arXiv:2312.14091)
• HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs (arXiv:2312.14140)
• Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis (arXiv:2312.13834)
• Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning (arXiv:2312.13980)
• ControlRoom3D: Room Generation using Semantic Proxy Rooms (arXiv:2312.05208)
• DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video (arXiv:2312.13528)
• DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis (arXiv:2312.13016)
• ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors (arXiv:2312.13324)
• MACS: Mass Conditioned 3D Hand and Object Motion Synthesis (arXiv:2312.14929)
• LangSplat: 3D Language Gaussian Splatting (arXiv:2312.16084)
• City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web (arXiv:2312.16457)
• Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis (arXiv:2312.16812)
• Restoration by Generation with Constrained Priors (arXiv:2312.17161)
• DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors (arXiv:2312.16837)
• PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion (arXiv:2312.16486)
• I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models (arXiv:2312.16693)
• Prompt Expansion for Adaptive Text-to-Image Generation (arXiv:2312.16720)
• En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data (arXiv:2401.01173)
• SIGNeRF: Scene Integrated Generation for Neural Radiance Fields (arXiv:2401.01647)
• Efficient Hybrid Zoom using Camera Fusion on Mobile Phones (arXiv:2401.01461)
• Instruct-Imagen: Image Generation with Multi-modal Instruction (arXiv:2401.01952)
• Denoising Vision Transformers (arXiv:2401.02957)
• CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection (arXiv:2310.02960)
• MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation (arXiv:2401.04468)
• GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation (arXiv:2401.04092)
• ODIN: A Single Model for 2D and 3D Perception (arXiv:2401.02416)
• PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models (arXiv:2401.05252)
• URHand: Universal Relightable Hands (arXiv:2401.05334)
• InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes (arXiv:2401.05335)
• PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding (arXiv:2312.04461)
• HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation (arXiv:2401.07727)
• Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation (arXiv:2401.08559)
• DiffusionGPT: LLM-Driven Text-to-Image Generation System (arXiv:2401.10061)
• VMamba: Visual State Space Model (arXiv:2401.10166)
• CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects (arXiv:2401.09962)
• SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers (arXiv:2401.08740)
• SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities (arXiv:2401.12168)
• Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (arXiv:2401.11708)
• Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data (arXiv:2401.10891)
• Fast Registration of Photorealistic Avatars for VR Facial Animation (arXiv:2401.11002)
• Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation (arXiv:2401.14257)
• SPAD: Spatially Aware Multiview Diffusers (arXiv:2402.05235)
• IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation (arXiv:2402.08682)
• Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation (arXiv:2402.10210)
• CityDreamer: Compositional Generative Model of Unbounded 3D Cities (arXiv:2309.00610)
• Multistep Consistency Models (arXiv:2403.06807)
• Jamba: A Hybrid Transformer-Mamba Language Model (arXiv:2403.19887)
• Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation (arXiv:2403.19319)
• FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework (arXiv:2408.06190)
• xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations (arXiv:2408.12590)
• Towards Realistic Example-based Modeling via 3D Gaussian Stitching (arXiv:2408.15708)
• DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos (arXiv:2409.02095)
• Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation (arXiv:2409.04410)
• Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models (arXiv:2409.07452)
• Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think (arXiv:2409.11355)
• Portrait Video Editing Empowered by Multimodal Generative Priors (arXiv:2409.13591)
• Align3R: Aligned Monocular Depth Estimation for Dynamic Videos (arXiv:2412.03079)