Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.15301

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Intresting Papers

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Representation Learning

End-to-End Vision Tokenizer Tuning

Paper • 2505.10562 • Published May 15, 2025 • 22
Global and Local Entailment Learning for Natural World Imagery

Paper • 2506.21476 • Published Jun 26, 2025 • 1
DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 291
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published Sep 1, 2025 • 58

Generative Modeling Approachs

Efficient Generative Modeling with Residual Vector Quantization-Based Tokens

Paper • 2412.10208 • Published Dec 13, 2024 • 19
Normalizing Flows are Capable Generative Models

Paper • 2412.06329 • Published Dec 9, 2024 • 11
A Noise is Worth Diffusion Guidance

Paper • 2412.03895 • Published Dec 5, 2024 • 29
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 103
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Paper • 2406.04338 • Published Jun 6, 2024 • 39
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 120

Is Noise Conditioning Necessary for Denoising Generative Models?

Paper • 2502.13129 • Published Feb 18, 2025 • 1
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers

Paper • 2504.10483 • Published Apr 14, 2025 • 21
Mean Flows for One-step Generative Modeling

Paper • 2505.13447 • Published May 19, 2025 • 7
Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9, 2025 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6, 2025 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12, 2025 • 74
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 54
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22, 2025 • 22
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 29
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11, 2025 • 90
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31, 2025 • 24
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13, 2025 • 8

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Is Noise Conditioning Necessary for Denoising Generative Models?

Paper • 2502.13129 • Published Feb 18, 2025 • 1
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers

Paper • 2504.10483 • Published Apr 14, 2025 • 21
Mean Flows for One-step Generative Modeling

Paper • 2505.13447 • Published May 19, 2025 • 7
Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Intresting Papers

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9, 2025 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6, 2025 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27

Representation Learning

End-to-End Vision Tokenizer Tuning

Paper • 2505.10562 • Published May 15, 2025 • 22
Global and Local Entailment Learning for Natural World Imagery

Paper • 2506.21476 • Published Jun 26, 2025 • 1
DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 291
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published Sep 1, 2025 • 58

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12, 2025 • 74
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 54
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22, 2025 • 22
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34

Generative Modeling Approachs

Efficient Generative Modeling with Residual Vector Quantization-Based Tokens

Paper • 2412.10208 • Published Dec 13, 2024 • 19
Normalizing Flows are Capable Generative Models

Paper • 2412.06329 • Published Dec 9, 2024 • 11
A Noise is Worth Diffusion Guidance

Paper • 2412.03895 • Published Dec 5, 2024 • 29
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 29
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11, 2025 • 90
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31, 2025 • 24
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13, 2025 • 8

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 103
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Paper • 2406.04338 • Published Jun 6, 2024 • 39
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 120

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs