I am a fourth-year CS PhD student at the University of Maryland, College Park, where I work in the GAMMA lab under the supervision of Prof. Dinesh Manocha. My current research deals with Embodied Navigation in Social Environments, where I work on interfacing robot agents with language and visual understanding capabilities. In particular, I am interested in utilizing Large Language Models (LLMs) and Vision-Language Models (VLMs) for robot decision-making, especially in generalized, few-shot, or zero-shot settings.
Prior to starting my PhD studies, I obtained a master's degree in Robotics at UMD. During this time, my primary research focus was Social Robot Navigation, and I worked with Prof. Aniket Bera.
I was a Research Fellow at the Center for Visual Information Technology (CVIT), IIIT-Hyderabad under Prof. C.V. Jawahar from Sept. 2017 to April 2019. During my time there, I collaborated with Prof. A.H. Abdul Hafez on a visual servoing project.
Before that, I received my undergraduate degree in 2017 from Symbiosis International University. For my B.Tech. thesis, I worked under Prof. Madhura Ingalhalikar on medical image processing, classifying brain tumor mutations from MRI scans.
My current research interests lie in the Embodied AI domain, at the intersection of natural language processing and robotics.
When I do get time, I love to travel, hike, compose music, and play chess/topoi! Maybe all together, someday.
Publications
- Improving Zero-Shot ObjectNav via Generative Communication. Under review at ICRA, 2025. [Paper Link]
- Right Place, Right Time! Towards ObjectNav for Non-Stationary Goals. Under review at a computer vision conference. [Paper Link]
- S-EQA: Tackling Situational Queries in Embodied Question Answering. Under review at RA-L, 2024. [Paper Link]
- Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis. Published at NAACL Main Conference, 2024. [Paper Link]
- Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Based Zero-Shot Object Navigation. Published at RA-L, 2023. [Paper Link]
- CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation. Published at the CoRL LangRob Workshop, 2022. [Paper Link]
- Can a Robot Trust You? A DRL-Based Approach to Trust-Driven, Human-Guided Navigation. Published at ICRA, 2021. [arXiv Link, Project Page]
- ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation. Published at IROS, 2020. [arXiv Link, Project Page]
- A Deep Learning Approach for Autonomous Corridor Following. Published at IROS, 2019. [Publication, PDF, Video]
Work Experience
- (Summer 2024) Internship at Sony Corp., where I studied the spatial reasoning capabilities of VLMs for embodied exploration and reasoning. I was supervised by Akira Nakamura, and mentored by Marzieh Edraki and Selim Engin.
- (Summer 2023) Internship at Amazon Alexa AI, where I worked on utilizing LLMs for embodied exploration and reasoning. I was supervised by Reza Ghanadhan, and mentored by Robinson Piramuthu and Prasoon Goyal.
- (Summer 2022) Internship at Amazon Alexa AI, where I worked on Vision-and-Language Navigation, an Embodied AI problem. I was supervised by Gaurav Sukhatme, and mentored by Robinson Piramuthu, Jesse Thomason, and Gunnar Sigurdsson.
- (Summer 2020) Internship at Nokia Bell Labs, where I worked on enabling Visual SLAM on an autonomous indoor Loomo robot.
- (Aug. 2019 - Present) Graduate Student at University of Maryland, College Park
- (Sept. 2017 - April 2019) Research Fellow at CVIT, IIIT-Hyderabad