Blog

June 2019

June 17, 2019 / Last updated: March 20, 2023 irfan CVPR

Paper in CVPR 2019 on “Embodied Question Answering in Photorealistic Environments with Point Cloud Perception”

Abstract: To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task: Embodied Question Answering in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several […]

June 17, 2019 / Last updated: March 20, 2023 irfan CVPR

Paper in CVPR 2019 on “Audio Visual Scene-Aware Dialog”

Abstract: We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the scene and the history of previous turns in the dialog. To answer successfully, agents must ground concepts from the question in the video while leveraging […]
