Open to Collab

Md Selim Sarowar

selim-sarowar

AI & ML interests

Vision Language Action Models, World Models, 5D Robot Manipulation, 3D Computer Vision

Recent Activity

liked a dataset 14 days ago

RajatDandekar/so101_box_to_bowl_v2

liked a dataset 14 days ago

RajatDandekar/so101_box_to_bowl

authored a paper about 1 month ago

Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models

Organizations

None yet

upvoted 4 papers about 1 month ago

GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models

Paper • 2603.09079 • Published Mar 10 • 1

Unified Vision-Language-Action Model

Paper • 2506.19850 • Published Jun 24, 2025 • 28

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 55

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

Paper • 2602.10098 • Published Feb 10 • 19