MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model
- etri-vilab/MultiHopSpatial-Qwen3-VL-4B-Instruct
  Image-Text-to-Text • 4B • Updated • 20
- etri-vilab/MultihopSpatial
  Viewer • Updated • 11.3k • 1.44k • 2
- MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model
  Paper • 2603.18892 • Published • 1


