Yuseung "Phillip" Lee

phillipinseoul

https://phillipinseoul.github.io/

AI & ML interests

Computer Vision

Recent Activity

liked a dataset about 12 hours ago

Journey9ni/vstibench

Viewer • Updated May 14, 2025 • 6.04k • 692 • 3

upvoted 2 papers about 21 hours ago

MolmoPoint: Better Pointing for VLMs with Grounding Tokens

Paper • 2603.28069 • Published 3 days ago • 6

Make Geometry Matter for Spatial Reasoning

Paper • 2603.26639 • Published 5 days ago • 27

liked a dataset 3 days ago

cambridgeltl/DARE

Viewer • Updated Feb 11, 2025 • 8.73k • 48 • 6

upvoted a paper 6 days ago

MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models

Paper • 2603.25744 • Published 6 days ago • 12

upvoted 3 papers 8 days ago

RealMaster: Lifting Rendered Scenes into Photorealistic Video

Paper • 2603.23462 • Published 8 days ago • 31

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Paper • 2603.12254 • Published 20 days ago • 21

SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning

Paper • 2603.22057 • Published 9 days ago • 45

upvoted 4 papers 9 days ago

VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

Paper • 2603.22285 • Published 9 days ago • 50

upvoted 2 papers 10 days ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published 15 days ago • 106

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Paper • 2603.19685 • Published 13 days ago • 19

liked a dataset 12 days ago

nyu-visionx/VSI-590K

Preview • Updated Nov 7, 2025 • 1.76k • 18

upvoted 2 papers 13 days ago

Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models

Paper • 2603.18002 • Published 14 days ago • 13

Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

Paper • 2603.19235 • Published 13 days ago • 93

upvoted 2 papers 14 days ago

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

Paper • 2603.18004 • Published 14 days ago • 12

MosaicMem: Hybrid Spatial Memory for Controllable Video World Models

Paper • 2603.17117 • Published 15 days ago • 87

upvoted a paper 15 days ago

MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation

Paper • 2603.16861 • Published 15 days ago • 9
