A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Gong Zhang, Kihyuk Sohn, Meera Hahn, Humphrey Shi, Irfan Essa
FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models Proceedings Article
In: Advances in Neural Information Processing Systems (NeurIPS), 2024.
Abstract | Links | BibTeX | Tags: computer vision, generative AI, generative media, machine learning, NeurIPS
@inproceedings{2024-Zhang-FFCSPTM,
title = {FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models},
author = {Gong Zhang and Kihyuk Sohn and Meera Hahn and Humphrey Shi and Irfan Essa},
url = {https://neurips.cc/virtual/2024/poster/96863
https://openreview.net/forum?id=1SmXUGzrH8},
year = {2024},
date = {2024-12-11},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
abstract = {Few-shot fine-tuning of text-to-image (T2I) generation models enables people to create unique images in their own style using natural language without requiring extensive prompt engineering. However, fine-tuning with only a handful, as little as one, of image-text paired data prevents fine-grained control of style attributes at generation. In this paper, we present FineStyle, a few-shot fine-tuning method that allows enhanced controllability for style personalized text-to-image generation. To overcome the lack of training data for fine-tuning, we propose a novel concept-oriented data scaling that amplifies the number of image-text pairs, each of which focuses on different concepts (e.g., objects) in the style reference image. We also identify the benefit of parameter-efficient adapter tuning of key and value kernels of cross-attention layers. Extensive experiments show the effectiveness of FineStyle at following fine-grained text prompts and delivering visual quality faithful to the specified style, measured by CLIP scores and human raters.
},
keywords = {computer vision, generative AI, generative media, machine learning, NeurIPS},
pubstate = {published},
tppubtype = {inproceedings}
}
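To make the adapter-tuning idea from the abstract above concrete, here is a minimal, hypothetical PyTorch-style sketch (not the authors' code) of low-rank adapters added to the key and value projections of a cross-attention layer, with the pretrained weights frozen; all class and variable names are illustrative assumptions.

import torch
import torch.nn as nn

class KVAdapterCrossAttention(nn.Module):
    # Illustrative sketch: cross-attention with low-rank adapters on the key/value kernels.
    def __init__(self, dim, ctx_dim, rank=4):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(ctx_dim, dim, bias=False)
        self.to_v = nn.Linear(ctx_dim, dim, bias=False)
        self.to_out = nn.Linear(dim, dim)
        for p in self.parameters():          # freeze the pretrained projections
            p.requires_grad = False
        # Only these small adapters are trained during few-shot personalization.
        self.k_adapter = nn.Sequential(nn.Linear(ctx_dim, rank, bias=False),
                                       nn.Linear(rank, dim, bias=False))
        self.v_adapter = nn.Sequential(nn.Linear(ctx_dim, rank, bias=False),
                                       nn.Linear(rank, dim, bias=False))

    def forward(self, x, context):
        q = self.to_q(x)
        k = self.to_k(context) + self.k_adapter(context)
        v = self.to_v(context) + self.v_adapter(context)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.to_out(attn @ v)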
Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra
Emergence of Maps in the Memories of Blind Navigation Agents Best Paper Proceedings Article
In: Proceedings of International Conference on Learning Representations (ICLR), 2023.
Abstract | Links | BibTeX | Tags: awards, best paper award, computer vision, google, ICLR, machine learning, robotics
@inproceedings{2023-Wijmans-EMMBNA,
title = {Emergence of Maps in the Memories of Blind Navigation Agents},
author = {Erik Wijmans and Manolis Savva and Irfan Essa and Stefan Lee and Ari S. Morcos and Dhruv Batra},
url = {https://arxiv.org/abs/2301.13261
https://wijmans.xyz/publication/eom/
https://openreview.net/forum?id=lTt4KjHSsyl
https://blog.iclr.cc/2023/03/21/announcing-the-iclr-2023-outstanding-paper-award-recipients/},
doi = {10.48550/ARXIV.2301.13261},
year = {2023},
date = {2023-05-01},
urldate = {2023-05-01},
booktitle = {Proceedings of International Conference on Learning Representations (ICLR)},
abstract = {Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental') maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural-networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial. Unlike animal navigation, we can judiciously design the agent's perceptual system and control the learning paradigm to nullify alternative navigation mechanisms. Specifically, we train 'blind' agents -- with sensing limited to only egomotion and no other sensing of any kind -- to perform PointGoal navigation ('go to Δ x, Δ y') via reinforcement learning. Our agents are composed of navigation-agnostic components (fully-connected and recurrent neural networks), and our experimental setup provides no inductive bias towards mapping. Despite these harsh conditions, we find that blind agents are (1) surprisingly effective navigators in new environments (~95% success); (2) they utilize memory over long horizons (remembering ~1,000 steps of past experience in an episode); (3) this memory enables them to exhibit intelligent behavior (following walls, detecting collisions, taking shortcuts); (4) there is emergence of maps and collision detection neurons in the representations of the environment built by a blind agent as it navigates; and (5) the emergent maps are selective and task dependent (e.g. the agent 'forgets' exploratory detours). Overall, this paper presents no new techniques for the AI audience, but a surprising finding, an insight, and an explanation.},
keywords = {awards, best paper award, computer vision, google, ICLR, machine learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
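As a rough illustration of the 'blind' agent setup described in the abstract above, the following hypothetical sketch shows a recurrent PointGoal policy whose only inputs are the goal offset (Δx, Δy) and the previous action; it is a minimal, assumption-laden sketch, not the paper's architecture.

import torch
import torch.nn as nn

class BlindPointGoalPolicy(nn.Module):
    # Illustrative sketch: no visual sensing, only egomotion-derived goal offset and last action.
    def __init__(self, num_actions=4, hidden=512):
        super().__init__()
        self.embed_goal = nn.Linear(2, 32)                      # (Δx, Δy) goal vector
        self.embed_prev_action = nn.Embedding(num_actions + 1, 32)  # +1 for "no previous action"
        self.rnn = nn.GRU(64, hidden, batch_first=True)
        self.policy_head = nn.Linear(hidden, num_actions)
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, goal, prev_action, h):
        x = torch.cat([self.embed_goal(goal), self.embed_prev_action(prev_action)], dim=-1)
        out, h = self.rnn(x.unsqueeze(1), h)
        out = out.squeeze(1)
        return self.policy_head(out), self.value_head(out), h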
José Lezama, Tim Salimans, Lu Jiang, Huiwen Chang, Jonathan Ho, Irfan Essa
Discrete Predictor-Corrector Diffusion Models for Image Synthesis Proceedings Article
In: International Conference on Learning Representations (ICLR), 2023.
Abstract | Links | BibTeX | Tags: computer vision, generative AI, generative media, google, ICLR, machine learning
@inproceedings{2023-Lezama-DPDMIS,
title = {Discrete Predictor-Corrector Diffusion Models for Image Synthesis},
author = {José Lezama and Tim Salimans and Lu Jiang and Huiwen Chang and Jonathan Ho and Irfan Essa},
url = {https://openreview.net/forum?id=VM8batVBWvg},
year = {2023},
date = {2023-05-01},
urldate = {2023-05-01},
booktitle = {International Conference on Learning Representations (ICLR)},
abstract = {We introduce Discrete Predictor-Corrector diffusion models (DPC), extending predictor-corrector samplers in Gaussian diffusion models to the discrete case. Predictor-corrector samplers are a class of samplers for diffusion models, which improve on ancestral samplers by correcting the sampling distribution of intermediate diffusion states using MCMC methods. In DPC, the Langevin corrector, which does not have a direct counterpart in discrete space, is replaced with a discrete MCMC transition defined by a learned corrector kernel. The corrector kernel is trained to make the correction steps achieve asymptotic convergence, in distribution, to the correct marginal of the intermediate diffusion states. Equipped with DPC, we revisit recent transformer-based non-autoregressive generative models through the lens of discrete diffusion, and find that DPC can alleviate the compounding decoding error due to the parallel sampling of visual tokens. Our experiments show that DPC improves upon existing discrete latent space models for class-conditional image generation on ImageNet, and outperforms continuous diffusion models and GANs, according to standard metrics and user preference studies.},
keywords = {computer vision, generative AI, generative media, google, ICLR, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
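A minimal sketch of the discrete predictor-corrector sampling loop described in the abstract above, under the assumption of two callables, predictor and corrector, that return per-token categorical logits; the names and interfaces are hypothetical, not the authors' implementation.

import torch

def sample_dpc(predictor, corrector, x_T, num_steps, num_corrector_steps=1):
    x = x_T                                      # initial (e.g. fully masked) token grid
    for t in reversed(range(1, num_steps + 1)):
        logits = predictor(x, t)                 # predictor proposes tokens for step t-1
        x = torch.distributions.Categorical(logits=logits).sample()
        for _ in range(num_corrector_steps):     # learned discrete MCMC correction step(s)
            corr_logits = corrector(x, t - 1)
            x = torch.distributions.Categorical(logits=corr_logits).sample()
    return x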
Erik Wijmans, Irfan Essa, Dhruv Batra
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement Proceedings Article
In: Oh, Alice H., Agarwal, Alekh, Belgrave, Danielle, Cho, Kyunghyun (Eds.): Advances in Neural Information Processing Systems (NeurIPS), 2022.
Abstract | Links | BibTeX | Tags: machine learning, NeurIPS, reinforcement learning, robotics
@inproceedings{2022-Wijmans-SOLENER,
title = {VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement},
author = {Erik Wijmans and Irfan Essa and Dhruv Batra},
editor = {Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
url = {https://arxiv.org/abs/2210.05064
https://openreview.net/forum?id=VrJWseIN98},
doi = {10.48550/ARXIV.2210.05064},
year = {2022},
date = {2022-12-01},
urldate = {2022-12-01},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
abstract = {We present Variable Experience Rollout (VER), a technique for efficiently scaling batched on-policy reinforcement learning in heterogenous environments (where different environments take vastly different times to generate rollouts) to many GPUs residing on, potentially, many machines. VER combines the strengths of and blurs the line between synchronous and asynchronous on-policy RL methods (SyncOnRL and AsyncOnRL, respectively). Specifically, it learns from on-policy experience (like SyncOnRL) and has no synchronization points (like AsyncOnRL) enabling high throughput.
We find that VER leads to significant and consistent speed-ups across a broad range of embodied navigation and mobile manipulation tasks in photorealistic 3D simulation environments. Specifically, for PointGoal navigation and ObjectGoal navigation in Habitat 1.0, VER is 60-100% faster (1.6-2x speedup) than DD-PPO, the current state of art for distributed SyncOnRL, with similar sample efficiency. For mobile manipulation tasks (open fridge/cabinet, pick/place objects) in Habitat 2.0 VER is 150% faster (2.5x speedup) on 1 GPU and 170% faster (2.7x speedup) on 8 GPUs than DD-PPO. Compared to SampleFactory (the current state-of-the-art AsyncOnRL), VER matches its speed on 1 GPU, and is 70% faster (1.7x speedup) on 8 GPUs with better sample efficiency.
We leverage these speed-ups to train chained skills for GeometricGoal rearrangement tasks in the Home Assistant Benchmark (HAB). We find a surprising emergence of navigation in skills that do not ostensibly require any navigation. Specifically, the Pick skill involves a robot picking an object from a table. During training the robot was always spawned close to the table and never needed to navigate. However, we find that if base movement is part of the action space, the robot learns to navigate then pick an object in new environments with 50% success, demonstrating surprisingly high out-of-distribution generalization.},
keywords = {machine learning, NeurIPS, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
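The core idea of variable experience rollouts can be illustrated with a toy sketch: rather than waiting for every (possibly slow) environment to complete a fixed-length rollout, learning proceeds once a total step budget has been gathered from whichever environments are ready. The environment and policy interfaces below (ready(), step(), act(), observation()) are hypothetical stand-ins, not the Habitat API.

def collect_variable_rollout(envs, policy, step_budget):
    # Fast environments naturally contribute more steps to the batch;
    # assumes at least one environment is always ready (toy simplification).
    batch = []
    steps = 0
    while steps < step_budget:
        env = next(e for e in envs if e.ready())
        transition = env.step(policy.act(env.observation()))
        batch.append(transition)
        steps += 1
    return batch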
Niranjan Kumar, Irfan Essa, Sehoon Ha
Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), pp. 7521-7527, 2022.
Abstract | Links | BibTeX | Tags: ICRA, machine learning, reinforcement learning, robotics
@inproceedings{2021-Kumar-GCSGIEUDRL,
title = {Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.1109/ICRA46639.2022.9811874
https://arxiv.org/abs/2109.10460
https://arxiv.org/pdf/2109.10460
https://www.kniranjankumar.com/projects/5_clutr
https://kniranjankumar.github.io/assets/pdf/graph_based_clutter.pdf
https://youtu.be/T2Jo7wwaXss},
doi = {10.1109/ICRA46639.2022.9811874},
year = {2022},
date = {2022-05-01},
urldate = {2022-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
journal = {arXiv},
number = {2109.10460},
pages = {7521-7527},
abstract = {We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a novel learning framework to train an effective scene exploration policy to discover hidden objects with minimal interactions. First, we define a novel scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent using deep reinforcement learning (deep RL), to manipulate this Scene Grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, using deep RL, to uncover hidden objects by interactively rearranging the scene.
},
keywords = {ICRA, machine learning, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
Karan Samel, Zelin Zhao, Binghong Chen, Shuang Li, Dharmashankar Subramanian, Irfan Essa, Le Song
Learning Temporal Rules from Noisy Timeseries Data Journal Article
In: arXiv preprint arXiv:2202.05403, 2022.
Abstract | Links | BibTeX | Tags: activity recognition, machine learning
@article{2022-Samel-LTRFNTD,
title = {Learning Temporal Rules from Noisy Timeseries Data},
author = {Karan Samel and Zelin Zhao and Binghong Chen and Shuang Li and Dharmashankar Subramanian and Irfan Essa and Le Song},
url = {https://arxiv.org/abs/2202.05403
https://arxiv.org/pdf/2202.05403},
year = {2022},
date = {2022-02-01},
urldate = {2022-02-01},
journal = {arXiv preprint arXiv:2202.05403},
abstract = {Events across a timeline are a common data representation, seen in different temporal modalities. Individual atomic events can occur in a certain temporal ordering to compose higher level composite events. Examples of a composite event are a patient's medical symptom or a baseball player hitting a home run, caused by distinct temporal orderings of patient vitals and player movements respectively. Such salient composite events are provided as labels in temporal datasets and most works optimize models to predict these composite event labels directly. We focus on uncovering the underlying atomic events and their relations that lead to the composite events within a noisy temporal data setting. We propose Neural Temporal Logic Programming (Neural TLP) which first learns implicit temporal relations between atomic events and then lifts logic rules for composite events, given only the composite events labels for supervision. This is done through efficiently searching through the combinatorial space of all temporal logic rules in an end-to-end differentiable manner. We evaluate our method on video and healthcare datasets where it outperforms the baseline methods for rule discovery.
},
keywords = {activity recognition, machine learning},
pubstate = {published},
tppubtype = {article}
}
Chengzhi Mao, Lu Jiang, Mostafa Dehghani, Carl Vondrick, Rahul Sukthankar, Irfan Essa
Discrete Representations Strengthen Vision Transformer Robustness Proceedings Article
In: Proceedings of International Conference on Learning Representations (ICLR), 2022.
Abstract | Links | BibTeX | Tags: computer vision, google, machine learning, vision transformer
@inproceedings{2022-Mao-DRSVTR,
title = {Discrete Representations Strengthen Vision Transformer Robustness},
author = {Chengzhi Mao and Lu Jiang and Mostafa Dehghani and Carl Vondrick and Rahul Sukthankar and Irfan Essa},
url = {https://iclr.cc/virtual/2022/poster/6647
https://arxiv.org/abs/2111.10493
https://research.google/pubs/pub51388/
https://openreview.net/forum?id=8hWs60AZcWk},
doi = {10.48550/arXiv.2111.10493},
year = {2022},
date = {2022-01-28},
urldate = {2022-04-01},
booktitle = {Proceedings of International Conference on Learning Representations (ICLR)},
journal = {arXiv preprint arXiv:2111.10493},
abstract = {Vision Transformer (ViT) is emerging as the state-of-the-art architecture for image recognition. While recent studies suggest that ViTs are more robust than their convolutional counterparts, our experiments find that ViTs trained on ImageNet are overly reliant on local textures and fail to make adequate use of shape information. ViTs thus have difficulties generalizing to out-of-distribution, real-world data. To address this deficiency, we present a simple and effective architecture modification to ViT's input layer by adding discrete tokens produced by a vector-quantized encoder. Different from the standard continuous pixel tokens, discrete tokens are invariant under small perturbations and contain less information individually, which promote ViTs to learn global information that is invariant. Experimental results demonstrate that adding discrete representation on four architecture variants strengthens ViT robustness by up to 12% across seven ImageNet robustness benchmarks while maintaining the performance on ImageNet.},
keywords = {computer vision, google, machine learning, vision transformer},
pubstate = {published},
tppubtype = {inproceedings}
}
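A minimal, hypothetical sketch of the input-layer modification described in the abstract above: embeddings of discrete codes from a (frozen) vector-quantized encoder are used alongside the usual continuous patch tokens. The fusion choice (concatenating the two token streams) and all names are assumptions for illustration, not the paper's exact design.

import torch
import torch.nn as nn

class DiscreteTokenInput(nn.Module):
    def __init__(self, patch_dim, model_dim, codebook_size=1024):
        super().__init__()
        self.pixel_proj = nn.Linear(patch_dim, model_dim)        # continuous pixel tokens
        self.code_embed = nn.Embedding(codebook_size, model_dim) # discrete VQ-code tokens

    def forward(self, patches, code_indices):
        # patches: (B, N, patch_dim) pixel patches; code_indices: (B, N) codes from a frozen VQ encoder
        pixel_tokens = self.pixel_proj(patches)
        discrete_tokens = self.code_embed(code_indices)
        # One simple fusion choice: concatenate both token streams before the transformer.
        return torch.cat([pixel_tokens, discrete_tokens], dim=1)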
Steven Hickson, Karthik Raveendran, Irfan Essa
Sharing Decoders: Network Fission for Multi-Task Pixel Prediction Proceedings Article
In: IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3771–3780, 2022.
Abstract | Links | BibTeX | Tags: computer vision, google, machine learning
@inproceedings{2022-Hickson-SDNFMPP,
title = {Sharing Decoders: Network Fission for Multi-Task Pixel Prediction},
author = {Steven Hickson and Karthik Raveendran and Irfan Essa},
url = {https://openaccess.thecvf.com/content/WACV2022/papers/Hickson_Sharing_Decoders_Network_Fission_for_Multi-Task_Pixel_Prediction_WACV_2022_paper.pdf
https://openaccess.thecvf.com/content/WACV2022/supplemental/Hickson_Sharing_Decoders_Network_WACV_2022_supplemental.pdf
https://youtu.be/qqYODA4C6AU},
doi = {10.1109/WACV51458.2022.00371},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
pages = {3771--3780},
abstract = {We examine the benefits of splitting encoder-decoders for multitask learning and showcase results on three tasks (semantics, surface normals, and depth) while adding very few FLOPS per task. Current hard parameter sharing methods for multi-task pixel-wise labeling use one shared encoder with separate decoders for each task. We generalize this notion and term the splitting of encoder-decoder architectures at different points as fission. Our ablation studies on fission show that sharing most of the decoder layers in multi-task encoder-decoder networks results in improvement while adding far fewer parameters per task. Our proposed method trains faster, uses less memory, results in better accuracy, and uses significantly fewer floating point operations (FLOPS) than conventional multi-task methods, with additional tasks only requiring 0.017% more FLOPS than the single-task network.},
keywords = {computer vision, google, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
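The fission idea from the abstract above can be sketched as a single encoder and a mostly shared decoder that splits only at the very end into cheap per-task heads; the module boundaries and names here are hypothetical, not the paper's exact network.

import torch.nn as nn

class FissionNet(nn.Module):
    def __init__(self, encoder, shared_decoder, decoder_channels, task_channels):
        super().__init__()
        self.encoder = encoder                 # shared backbone
        self.shared_decoder = shared_decoder   # decoder layers shared across all tasks
        # Per-task 1x1 heads: the only unshared parameters, adding very few FLOPS per task.
        self.heads = nn.ModuleDict({
            task: nn.Conv2d(decoder_channels, out_ch, kernel_size=1)
            for task, out_ch in task_channels.items()
        })

    def forward(self, x):
        features = self.shared_decoder(self.encoder(x))
        return {task: head(features) for task, head in self.heads.items()}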
Karan Samel, Zelin Zhao, Binghong Chen, Shuang Li, Dharmashankar Subramanian, Irfan Essa, Le Song
Neural Temporal Logic Programming Technical Report
2021.
Abstract | Links | BibTeX | Tags: activity recognition, arXiv, machine learning, openreview
@techreport{2021-Samel-NTLP,
title = {Neural Temporal Logic Programming},
author = {Karan Samel and Zelin Zhao and Binghong Chen and Shuang Li and Dharmashankar Subramanian and Irfan Essa and Le Song},
url = {https://openreview.net/forum?id=i7h4M45tU8},
year = {2021},
date = {2021-09-01},
urldate = {2021-09-01},
abstract = {Events across a timeline are a common data representation, seen in different temporal modalities. Individual atomic events can occur in a certain temporal ordering to compose higher-level composite events. Examples of a composite event are a patient's medical symptom or a baseball player hitting a home run, caused by distinct temporal orderings of patient vitals and player movements respectively. Such salient composite events are provided as labels in temporal datasets and most works optimize models to predict these composite event labels directly. We focus on uncovering the underlying atomic events and their relations that lead to the composite events within a noisy temporal data setting. We propose Neural Temporal Logic Programming (Neural TLP) which first learns implicit temporal relations between atomic events and then lifts logic rules for composite events, given only the composite events labels for supervision. This is done through efficiently searching through the combinatorial space of all temporal logic rules in an end-to-end differentiable manner. We evaluate our method on video and on healthcare data where it outperforms the baseline methods for rule discovery.},
howpublished = {https://openreview.net/forum?id=i7h4M45tU8},
keywords = {activity recognition, arXiv, machine learning, openreview},
pubstate = {published},
tppubtype = {techreport}
}
Harish Haresamudram, Irfan Essa, Thomas Ploetz
Contrastive Predictive Coding for Human Activity Recognition Journal Article
In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 5, no. 2, pp. 1–26, 2021.
Abstract | Links | BibTeX | Tags: activity recognition, IMWUT, machine learning, ubiquitous computing
@article{2021-Haresamudram-CPCHAR,
title = {Contrastive Predictive Coding for Human Activity Recognition},
author = {Harish Haresamudram and Irfan Essa and Thomas Ploetz},
url = {https://doi.org/10.1145/3463506
https://arxiv.org/abs/2012.05333},
doi = {10.1145/3463506},
year = {2021},
date = {2021-06-01},
urldate = {2021-06-01},
booktitle = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
volume = {5},
number = {2},
pages = {1--26},
abstract = {Feature extraction is crucial for human activity recognition (HAR) using body-worn movement sensors. Recently, learned representations have been used successfully, offering promising alternatives to manually engineered features. Our work focuses on effective use of small amounts of labeled data and the opportunistic exploitation of unlabeled data that are straightforward to collect in mobile and ubiquitous computing scenarios. We hypothesize and demonstrate that explicitly considering the temporality of sensor data at representation level plays an important role for effective HAR in challenging scenarios. We introduce the Contrastive Predictive Coding (CPC) framework to human activity recognition, which captures the long-term temporal structure of sensor data streams. Through a range of experimental evaluations on real-life recognition tasks, we demonstrate its effectiveness for improved HAR. CPC-based pre-training is self-supervised, and the resulting learned representations can be integrated into standard activity chains. It leads to significantly improved recognition performance when only small amounts of labeled training data are available, thereby demonstrating the practical value of our approach.},
keywords = {activity recognition, IMWUT, machine learning, ubiquitous computing},
pubstate = {published},
tppubtype = {article}
}
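A minimal sketch of how Contrastive Predictive Coding can be set up on a body-worn sensor stream: a convolutional encoder produces per-timestep latents, a GRU summarizes the past into a context vector, and per-offset heads predict future latents (to be scored against negatives with an InfoNCE loss, omitted here). Architecture sizes and names are assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class CPCSensor(nn.Module):
    def __init__(self, in_ch=3, latent=128, context=256, k_future=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv1d(in_ch, latent, 5, padding=2), nn.ReLU())
        self.context_rnn = nn.GRU(latent, context, batch_first=True)
        # One linear head per future offset, used by the (omitted) InfoNCE loss.
        self.predictors = nn.ModuleList([nn.Linear(context, latent) for _ in range(k_future)])

    def forward(self, x):                     # x: (B, C, T) accelerometer stream
        z = self.encoder(x).transpose(1, 2)   # (B, T, latent) per-timestep latents
        c, _ = self.context_rnn(z)            # (B, T, context) summaries of the past
        return z, c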
Harish Haresamudram, Apoorva Beedu, Varun Agrawal, Patrick L Grady, Irfan Essa, Judy Hoffman, Thomas Plötz
Masked reconstruction based self-supervision for human activity recognition Proceedings Article
In: Proceedings of the International Symposium on Wearable Computers (ISWC), pp. 45–49, 2020.
Abstract | Links | BibTeX | Tags: activity recognition, ISWC, machine learning, wearable computing
@inproceedings{2020-Haresamudram-MRBSHAR,
title = {Masked reconstruction based self-supervision for human activity recognition},
author = {Harish Haresamudram and Apoorva Beedu and Varun Agrawal and Patrick L Grady and Irfan Essa and Judy Hoffman and Thomas Plötz},
url = {https://dl.acm.org/doi/10.1145/3410531.3414306
https://harkash.github.io/publication/masked-reconstruction
https://arxiv.org/abs/2202.12938},
doi = {10.1145/3410531.3414306},
year = {2020},
date = {2020-09-01},
urldate = {2020-09-01},
booktitle = {Proceedings of the International Symposium on Wearable Computers (ISWC)},
pages = {45--49},
abstract = {The ubiquitous availability of wearable sensing devices has rendered large scale collection of movement data a straightforward endeavor. Yet, annotation of these data remains a challenge and as such, publicly available datasets for human activity recognition (HAR) are typically limited in size as well as in variability, which constrains HAR model training and effectiveness. We introduce masked reconstruction as a viable self-supervised pre-training objective for human activity recognition and explore its effectiveness in comparison to state-of-the-art unsupervised learning techniques. In scenarios with small labeled datasets, the pre-training results in improvements over end-to-end learning on two of the four benchmark datasets. This is promising because the pre-training objective can be integrated "as is" into state-of-the-art recognition pipelines to effectively facilitate improved model robustness, and thus, ultimately, leading to better recognition performance.
},
keywords = {activity recognition, ISWC, machine learning, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
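The masked-reconstruction pretext task described above can be summarized in a few lines: randomly zero out timesteps of a sensor window and train the network to reconstruct them, computing the loss only at the masked positions. This is an illustrative sketch with hypothetical shapes, not the authors' code.

import torch

def masked_reconstruction_loss(model, x, mask_prob=0.1):
    # x: (B, T, C) windows of body-worn sensor data
    mask = torch.rand(x.shape[:2], device=x.device) < mask_prob   # (B, T) masked timesteps
    corrupted = x.clone()
    corrupted[mask] = 0.0
    recon = model(corrupted)                  # model predicts the full window
    # Loss is computed only on the masked timesteps.
    return ((recon - x)[mask] ** 2).mean()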
Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar
Category learning neural networks Patent
2020.
Abstract | Links | BibTeX | Tags: google, machine learning, patents
@patent{2020-Hickson-CLNN,
title = {Category learning neural networks},
author = {Steven Hickson and Anelia Angelova and Irfan Essa and Rahul Sukthankar},
url = {https://patents.google.com/patent/US10635979},
year = {2020},
date = {2020-04-28},
urldate = {2020-04-28},
publisher = {(US Patent # 10635979)},
abstract = {Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a clustering of images into a plurality of semantic categories. In one aspect, a method comprises: training a categorization neural network, comprising, at each of a plurality of iterations: processing an image depicting an object using the categorization neural network to generate (i) a current prediction for whether the image depicts an object or a background region, and (ii) a current embedding of the image; determining a plurality of current cluster centers based on the current values of the categorization neural network parameters, wherein each cluster center represents a respective semantic category; and determining a gradient of an objective function that includes a classification loss and a clustering loss, wherein the clustering loss depends on a similarity between the current embedding of the image and the current cluster centers.
},
howpublished = {US Patent #10635979},
keywords = {google, machine learning, patents},
pubstate = {published},
tppubtype = {patent}
}
Unaiza Ahsan, Rishi Madhok, Irfan Essa
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition Proceedings Article
In: IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 179-189, 2019, ISSN: 1550-5790.
Links | BibTeX | Tags: activity recognition, computer vision, machine learning, WACV
@inproceedings{2019-Ahsan-VJULSCVAR,
title = {Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition},
author = {Unaiza Ahsan and Rishi Madhok and Irfan Essa},
url = {https://ieeexplore.ieee.org/abstract/document/8659002},
doi = {10.1109/WACV.2019.00025},
issn = {1550-5790},
year = {2019},
date = {2019-01-01},
urldate = {2019-01-01},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
pages = {179-189},
keywords = {activity recognition, computer vision, machine learning, WACV},
pubstate = {published},
tppubtype = {inproceedings}
}
Unaiza Ahsan, Rishi Madhok, Irfan Essa
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition Journal Article
In: arXiv, no. arXiv:1808.07507, 2018.
BibTeX | Tags: activity recognition, computer vision, machine learning
@article{2018-Ahsan-VJULSCVAR,
title = {Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition},
author = {Unaiza Ahsan and Rishi Madhok and Irfan Essa},
year = {2018},
date = {2018-08-01},
journal = {arXiv},
number = {arXiv:1808.07507},
keywords = {activity recognition, computer vision, machine learning},
pubstate = {published},
tppubtype = {article}
}
Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar
Object category learning and retrieval with weak supervision Technical Report
no. arXiv:1801.08985, 2018.
Abstract | Links | BibTeX | Tags: arXiv, computer vision, machine learning, object detection
@techreport{2018-Hickson-OCLRWWS,
title = {Object category learning and retrieval with weak supervision},
author = {Steven Hickson and Anelia Angelova and Irfan Essa and Rahul Sukthankar},
url = {https://arxiv.org/abs/1801.08985
https://arxiv.org/pdf/1801.08985},
doi = {10.48550/arXiv.1801.08985},
year = {2018},
date = {2018-07-01},
urldate = {2018-07-01},
journal = {arXiv},
number = {arXiv:1801.08985},
abstract = {We consider the problem of retrieving objects from image data and learning to classify them into meaningful semantic categories with minimal supervision. To that end, we propose a fully differentiable unsupervised deep clustering approach to learn semantic classes in an end-to-end fashion without individual class labeling using only unlabeled object proposals. The key contributions of our work are 1) a kmeans clustering objective where the clusters are learned as parameters of the network and are represented as memory units, and 2) simultaneously building a feature representation, or embedding, while learning to cluster it. This approach shows promising results on two popular computer vision datasets: on CIFAR10 for clustering objects, and on the more complex and challenging Cityscapes dataset for semantically discovering classes which visually correspond to cars, people, and bicycles. Currently, the only supervision provided is segmentation objectness masks, but this method can be extended to use an unsupervised objectness-based object generation mechanism which will make the approach completely unsupervised.
},
howpublished = {arXiv:1801.08985},
keywords = {arXiv, computer vision, machine learning, object detection},
pubstate = {published},
tppubtype = {techreport}
}
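A toy sketch of the clustering objective described above, under the assumption that cluster centers are stored as network parameters ("memory units") and a k-means-style loss pulls each embedding toward its nearest center; the full method also combines this with a classification/objectness loss, omitted here, and all names are illustrative.

import torch
import torch.nn as nn

class DeepClustering(nn.Module):
    def __init__(self, embed_net, num_clusters, embed_dim):
        super().__init__()
        self.embed_net = embed_net                       # produces image/proposal embeddings
        self.centers = nn.Parameter(torch.randn(num_clusters, embed_dim))  # learned cluster centers

    def clustering_loss(self, x):
        z = self.embed_net(x)                            # (B, D) embeddings
        dists = torch.cdist(z, self.centers)             # (B, K) distances to each center
        return dists.min(dim=1).values.pow(2).mean()     # squared distance to nearest center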
Unaiza Ahsan, Chen Sun, Irfan Essa
DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks Journal Article
In: arXiv, no. arXiv:1801.07230, 2018.
BibTeX | Tags: activity recognition, computer vision, machine learning
@article{2018-Ahsan-DSARFVUGAN,
title = {DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks},
author = {Unaiza Ahsan and Chen Sun and Irfan Essa},
year = {2018},
date = {2018-01-01},
journal = {arXiv},
number = {arXiv:1801.07230},
keywords = {activity recognition, computer vision, machine learning},
pubstate = {published},
tppubtype = {article}
}
Unaiza Ahsan, Munmun De Choudhury, Irfan Essa
Towards Using Visual Attributes to Infer Image Sentiment Of Social Events Proceedings Article
In: Proceedings of The International Joint Conference on Neural Networks, International Neural Network Society, Anchorage, Alaska, US, 2017.
Abstract | Links | BibTeX | Tags: computational journalism, computer vision, IJNN, machine learning
@inproceedings{2017-Ahsan-TUVAIISSE,
title = {Towards Using Visual Attributes to Infer Image Sentiment Of Social Events},
author = {Unaiza Ahsan and Munmun De Choudhury and Irfan Essa},
url = {https://ieeexplore.ieee.org/abstract/document/7966013},
doi = {10.1109/IJCNN.2017.7966013},
year = {2017},
date = {2017-05-01},
urldate = {2017-05-01},
booktitle = {Proceedings of The International Joint Conference on Neural Networks},
publisher = {International Neural Network Society},
address = {Anchorage, Alaska, US},
abstract = {Widespread and pervasive adoption of smartphones has led to instant sharing of photographs that capture events ranging from mundane to life-altering happenings. We propose to capture sentiment information of such social event images leveraging their visual content. Our method extracts an intermediate visual representation of social event images based on the visual attributes that occur in the images going beyond sentiment-specific attributes. We map the top predicted attributes to sentiments and extract the dominant emotion associated with a picture of a social event. Unlike recent approaches, our method generalizes to a variety of social events and even to unseen events, which are not available at training time. We demonstrate the effectiveness of our approach on a challenging social event image dataset and our method outperforms state-of-the-art approaches for classifying complex event images into sentiments.
},
keywords = {computational journalism, computer vision, IJNN, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
Unaiza Ahsan, Chen Sun, James Hays, Irfan Essa
Complex Event Recognition from Images with Few Training Examples Proceedings Article
In: IEEE Winter Conference on Applications of Computer Vision (WACV), 2017.
Abstract | Links | BibTeX | Tags: activity recognition, computer vision, machine learning, WACV
@inproceedings{2017-Ahsan-CERFIWTE,
title = {Complex Event Recognition from Images with Few Training Examples},
author = {Unaiza Ahsan and Chen Sun and James Hays and Irfan Essa},
url = {https://arxiv.org/abs/1701.04769
https://www.computer.org/csdl/proceedings-article/wacv/2017/07926663/12OmNzZEAzy},
doi = {10.1109/WACV.2017.80},
year = {2017},
date = {2017-03-01},
urldate = {2017-03-01},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
abstract = {We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use that to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions and event sub-types, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept and we use (pretrained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines using deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a dataset with unseen event categories using a single training example.
},
keywords = {activity recognition, computer vision, machine learning, WACV},
pubstate = {published},
tppubtype = {inproceedings}
}
Daniel Castro, Steven Hickson, Vinay Bettadapura, Edison Thomaz, Gregory Abowd, Henrik Christensen, Irfan Essa
Predicting Daily Activities from Egocentric Images Using Deep Learning Proceedings Article
In: Proceedings of International Symposium on Wearable Computers (ISWC), 2015.
Abstract | Links | BibTeX | Tags: activity recognition, computer vision, ISWC, machine learning, wearable computing
@inproceedings{2015-Castro-PDAFEIUDL,
title = {Predicting Daily Activities from Egocentric Images Using Deep Learning},
author = {Daniel Castro and Steven Hickson and Vinay Bettadapura and Edison Thomaz and Gregory Abowd and Henrik Christensen and Irfan Essa},
url = {https://dl.acm.org/doi/10.1145/2802083.2808398
https://arxiv.org/abs/1510.01576
http://www.cc.gatech.edu/cpl/projects/dailyactivities/
},
doi = {10.1145/2802083.2808398},
year = {2015},
date = {2015-09-01},
urldate = {2015-09-01},
booktitle = {Proceedings of International Symposium on Wearable Computers (ISWC)},
abstract = {We present a method to analyze images taken from a passive egocentric wearable camera along with contextual information, such as time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over 6 months with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person's activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.},
keywords = {activity recognition, computer vision, ISWC, machine learning, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
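The late fusion ensemble idea from the abstract above can be sketched as a small classifier that combines the CNN's class probabilities with contextual features such as time of day and day of week; layer sizes and names below are hypothetical, not the paper's exact configuration.

import torch
import torch.nn as nn

class LateFusionEnsemble(nn.Module):
    def __init__(self, cnn, num_classes=19, context_dim=2):
        super().__init__()
        self.cnn = cnn                                    # image-based activity CNN
        self.fusion = nn.Sequential(
            nn.Linear(num_classes + context_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes))

    def forward(self, image, context):
        probs = torch.softmax(self.cnn(image), dim=-1)    # CNN class probabilities
        return self.fusion(torch.cat([probs, context], dim=-1))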
Edison Thomaz, Irfan Essa, Gregory Abowd
A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing Proceedings Article
In: ACM International Conference on Ubiquitous Computing (UBICOMP), 2015.
Abstract | Links | BibTeX | Tags: activity recognition, computational health, machine learning, Ubicomp, ubiquitous computing
@inproceedings{2015-Thomaz-PAREMWWIS,
title = {A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing},
author = {Edison Thomaz and Irfan Essa and Gregory Abowd},
url = {https://dl.acm.org/doi/10.1145/2750858.2807545},
doi = {10.1145/2750858.2807545},
year = {2015},
date = {2015-09-01},
urldate = {2015-09-01},
booktitle = {ACM International Conference on Ubiquitous Computing (UBICOMP)},
abstract = {Recognizing when eating activities take place is one of the key challenges in automated food intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday usage, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% Precision, 88.8% Recall), and 71.3% (65.2% Precision, 78.6% Recall). This work represents a contribution towards the implementation of a practical, automated system for everyday food intake monitoring, with applicability in areas ranging from health research and food journaling.
},
keywords = {activity recognition, computational health, machine learning, Ubicomp, ubiquitous computing},
pubstate = {published},
tppubtype = {inproceedings}
}
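A hypothetical sketch of this kind of pipeline: sliding windows over 3-axis wrist accelerometry, simple per-window statistics, and an off-the-shelf classifier. The specific features, window length, and classifier here are illustrative assumptions, not the paper's exact method.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(acc, win=128, step=64):
    # acc: (T, 3) accelerometer samples; returns one feature row per sliding window.
    feats = []
    for start in range(0, len(acc) - win + 1, step):
        w = acc[start:start + win]
        feats.append(np.concatenate([w.mean(0), w.std(0), w.min(0), w.max(0)]))
    return np.array(feats)

# Hypothetical usage, with one eating / non-eating label per window:
# clf = RandomForestClassifier(n_estimators=100).fit(window_features(acc), labels)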
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, BibSonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced by using the teachPress plugin for WordPress.