Irfan Essa

Vision and Language

June 17, 2019 / Last updated: September 1, 2020 / irfan / Computer Vision

Paper in CVPR 2019 on “Embodied Question Answering in Photorealistic Environments with Point Cloud Perception”

Abstract: To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task: Embodied Question Answering in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several […]

June 17, 2019 / Last updated: September 1, 2020 / irfan / Computer Vision

Paper in CVPR 2019 on “Audio Visual Scene-Aware Dialog”

Abstract: We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the scene and the history of previous turns in the dialog. To answer successfully, agents must ground concepts from the question in the video while leveraging […]

Copyright © Irfan Essa All Rights Reserved.
