A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang
Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation Proceedings Article
In: Proceedings of the European Conference on Computer Vision (ECCV), 2024.
Abstract | Links | BibTeX | Tags: arXiv, computer vision, ECCV, generative AI, google, reinforcement learning
@inproceedings{2024-Lee-PPMRLFTG,
title = {Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation},
author = {Seung Hyun Lee and Yinxiao Li and Junjie Ke and Innfarn Yoo and Han Zhang and Jiahui Yu and Qifei Wang and Fei Deng and Glenn Entis and Junfeng He and Gang Li and Sangpil Kim and Irfan Essa and Feng Yang},
url = {https://arxiv.org/abs/2401.05675
https://arxiv.org/pdf/2401.05675
https://dl.acm.org/doi/10.1007/978-3-031-72920-1_26},
doi = {10.48550/arXiv.2401.05675},
year = {2024},
date = {2024-07-25},
urldate = {2024-07-25},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
abstract = {Recent works have demonstrated that using reinforcement learning (RL) with multiple quality rewards can improve the quality of generated images in text-to-image (T2I) generation. However, manually adjusting reward weights poses challenges and may cause over-optimization of certain metrics. To solve this, we propose Parrot, which addresses the issue through multi-objective optimization and introduces an effective multi-reward optimization strategy to approximate Pareto optimality. Utilizing batch-wise Pareto-optimal selection, Parrot automatically identifies the optimal trade-off among different rewards. We use this novel multi-reward optimization algorithm to jointly optimize the T2I model and a prompt expansion network, resulting in significant improvements in image quality and allowing control over the trade-off among different rewards via a reward-related prompt during inference. Furthermore, we introduce original prompt-centered guidance at inference time, ensuring fidelity to user input after prompt expansion. Extensive experiments and a user study validate the superiority of Parrot over several baselines across various quality criteria, including aesthetics, human preference, text-image alignment, and image sentiment.},
keywords = {arXiv, computer vision, ECCV, generative AI, google, reinforcement learning},
pubstate = {published},
tppubtype = {inproceedings}
}
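To make the batch-wise Pareto-optimal selection concrete, below is a minimal Python sketch of non-dominated filtering over per-sample reward vectors. It assumes only that each generated image in a batch carries K scalar reward scores; the function name and surrounding setup are illustrative, not the paper's implementation.

import numpy as np

def pareto_optimal_indices(rewards):
    # rewards: (N, K) array with one row of K reward scores (e.g.,
    # aesthetics, human preference, alignment, sentiment) per image.
    # Sample i is dominated if another sample scores >= on every reward
    # and strictly > on at least one.
    keep = []
    for i in range(len(rewards)):
        dominated = np.all(rewards >= rewards[i], axis=1) & \
                    np.any(rewards > rewards[i], axis=1)
        if not dominated.any():
            keep.append(i)
    return keep

During RL fine-tuning, only the samples returned here would receive update weight, so the trade-off among rewards is discovered per batch rather than fixed by hand-tuned reward weights.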
K. Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Journal Article
In: IEEE Robotics and Automation Letters, vol. 8, iss. 8, pp. 4601–4608, 2023.
Abstract | Links | BibTeX | Tags: IEEE, reinforcement learning, robotics
@article{2023-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {K. Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://ieeexplore.ieee.org/document/10152471},
doi = {10.1109/LRA.2023.3286171},
year = {2023},
date = {2023-06-14},
urldate = {2023-06-14},
journal = {IEEE Robotics and Automation Letters},
volume = {8},
issue = {8},
pages = {4601--4608},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework combines multiple levels of pre-learned skills by using multiplicative skill composition and residual action learning. We also introduce a goal synthesis network and an observation selector to support the combination of heterogeneous skills, each with its unique goals and observation space. Finally, we develop residual regularization for learning policies that solve a new task while preserving the style of the motion enforced by the skill library. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {IEEE, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {article}
}
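As a rough illustration of the multiplicative skill composition plus residual action learning described above, here is a sketch that assumes each frozen skill policy outputs a Gaussian over actions and the new policy outputs per-skill gates; the precision-weighted product form is one standard way to realize multiplicative composition and is an assumption here, not necessarily the paper's exact parameterization.

import numpy as np

def compose_action(mus, sigmas, gates, residual):
    # mus, sigmas: (num_skills, action_dim) means/stddevs from the frozen
    # skill library; gates: (num_skills,) non-negative weights produced by
    # the newly trained policy. The composite mean is the precision-weighted
    # product of the skill Gaussians; the residual is a small task-specific
    # correction, regularized during training to preserve the library's style.
    prec = gates[:, None] / sigmas**2
    mean = (prec * mus).sum(axis=0) / prec.sum(axis=0)
    return mean + residual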
Erik Wijmans, Irfan Essa, Dhruv Batra
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement Proceedings Article
In: Oh, Alice H., Agarwal, Alekh, Belgrave, Danielle, Cho, Kyunghyun (Ed.): Advances in Neural Information Processing Systems (NeurIPS), 2022.
Abstract | Links | BibTeX | Tags: machine learning, NeurIPS, reinforcement learning, robotics
@inproceedings{2022-Wijmans-SOLENER,
title = {VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement},
author = {Erik Wijmans and Irfan Essa and Dhruv Batra},
editor = {Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
url = {https://arxiv.org/abs/2210.05064
https://openreview.net/forum?id=VrJWseIN98},
doi = {10.48550/ARXIV.2210.05064},
year = {2022},
date = {2022-12-01},
urldate = {2022-12-01},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
abstract = {We present Variable Experience Rollout (VER), a technique for efficiently scaling batched on-policy reinforcement learning in heterogeneous environments (where different environments take vastly different times to generate rollouts) to many GPUs residing on, potentially, many machines. VER combines the strengths of and blurs the line between synchronous and asynchronous on-policy RL methods (SyncOnRL and AsyncOnRL, respectively). Specifically, it learns from on-policy experience (like SyncOnRL) and has no synchronization points (like AsyncOnRL), enabling high throughput.
We find that VER leads to significant and consistent speed-ups across a broad range of embodied navigation and mobile manipulation tasks in photorealistic 3D simulation environments. Specifically, for PointGoal navigation and ObjectGoal navigation in Habitat 1.0, VER is 60-100% faster (1.6-2x speedup) than DD-PPO, the current state of the art for distributed SyncOnRL, with similar sample efficiency. For mobile manipulation tasks (open fridge/cabinet, pick/place objects) in Habitat 2.0, VER is 150% faster (2.5x speedup) on 1 GPU and 170% faster (2.7x speedup) on 8 GPUs than DD-PPO. Compared to SampleFactory (the current state-of-the-art AsyncOnRL), VER matches its speed on 1 GPU, and is 70% faster (1.7x speedup) on 8 GPUs with better sample efficiency.
We leverage these speed-ups to train chained skills for GeometricGoal rearrangement tasks in the Home Assistant Benchmark (HAB). We find a surprising emergence of navigation in skills that do not ostensibly require any navigation. Specifically, the Pick skill involves a robot picking an object from a table. During training the robot was always spawned close to the table and never needed to navigate. However, we find that if base movement is part of the action space, the robot learns to navigate and then pick an object in new environments with 50% success, demonstrating surprisingly high out-of-distribution generalization.},
keywords = {machine learning, NeurIPS, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
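The scheduling idea behind VER can be shown with a toy producer-consumer sketch: workers step environments of very different speeds, and the learner consumes a fixed-size batch of fresh on-policy steps as soon as one exists. Everything below (the names, the queue-based transport) is a single-machine simplification for illustration, not the paper's multi-GPU system.

import queue
import random
import threading
import time

def update_policy(batch):
    # Stand-in for a PPO-style on-policy update.
    print(f"update on {len(batch)} steps")

def worker(step_queue, env_delay):
    # Environments take vastly different times per step; each worker
    # contributes experience as fast as its own environment allows.
    while True:
        time.sleep(env_delay)
        step_queue.put({"obs": random.random()})

def learner(step_queue, batch_size, num_updates):
    # Learn from a fixed-size batch of fresh on-policy steps: no global
    # synchronization point (unlike SyncOnRL) and no stale off-policy
    # data (unlike AsyncOnRL).
    for _ in range(num_updates):
        batch = [step_queue.get() for _ in range(batch_size)]
        update_policy(batch)

step_queue = queue.Queue()
for delay in (0.01, 0.05, 0.2):  # heterogeneous environment speeds
    threading.Thread(target=worker, args=(step_queue, delay), daemon=True).start()
learner(step_queue, batch_size=32, num_updates=3)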
Niranjan Kumar, Irfan Essa, Sehoon Ha
Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), pp. 7521-7527, 2022.
Abstract | Links | BibTeX | Tags: ICRA, machine learning, reinforcement learning, robotics
@inproceedings{2021-Kumar-GCSGIEUDRL,
title = {Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.1109/ICRA46639.2022.9811874
https://arxiv.org/abs/2109.10460
https://arxiv.org/pdf/2109.10460
https://www.kniranjankumar.com/projects/5_clutr
https://kniranjankumar.github.io/assets/pdf/graph_based_clutter.pdf
https://youtu.be/T2Jo7wwaXss},
doi = {10.1109/ICRA46639.2022.9811874},
year = {2022},
date = {2022-05-01},
urldate = {2022-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
pages = {7521-7527},
abstract = {We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a novel learning framework to train an effective scene exploration policy to discover hidden objects with minimal interactions. First, we define a novel scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent using deep reinforcement learning (deep RL) to manipulate this Scene Grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, using deep RL, to uncover hidden objects by interactively rearranging the scene.},
keywords = {ICRA, machine learning, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
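As a rough sketch of the scene-grammar idea, structured clutter can be represented as a support graph that a generation agent grows one placement at a time; the classes and the placement action below are hypothetical simplifications, not the paper's grammar or its GNN policy.

from dataclasses import dataclass, field

@dataclass
class SceneNode:
    # One object in the clutter graph; children are objects resting on or
    # stacked against this one. Objects occluded along the viewing
    # direction are the hidden objects the exploration agent must uncover.
    name: str
    children: list = field(default_factory=list)

def place(support, obj):
    # A generation-agent action (sketch): grow the graph by one placement.
    # The paper's GNN-based agent chooses such edits with deep RL so that
    # scenes remain physically stable and contain hidden objects.
    support.children.append(obj)
    return obj

shelf = SceneNode("shelf")
box = place(shelf, SceneNode("cereal_box"))
place(shelf, SceneNode("soup_can"))  # may end up hidden behind the box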
Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Proceedings Article
In: Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL), arXiv, 2022.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2022-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://arxiv.org/abs/2212.08954
https://www.kniranjankumar.com/ccrl/static/pdf/paper.pdf
https://youtu.be/fAklIxiK7Qg},
doi = {10.48550/ARXIV.2212.08954},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL)},
publisher = {arXiv},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework learns multiplicative policy composition, task-specific residual actions, and synthetic goal information simultaneously while freezing the prerequisite policies. We further explicitly control the style of the motion by regularizing residual actions. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
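The workshop version emphasizes explicit regularization of residual actions; one simple way such a term can enter the training objective is sketched below, with the penalty weight and all names being assumptions rather than the paper's exact formulation.

import numpy as np

def regularized_objective(task_return, residual_actions, lam=0.1):
    # residual_actions: (T, action_dim) residuals over an episode. The
    # penalty keeps corrections small so the composite motion retains the
    # style enforced by the frozen skill library.
    return task_return - lam * np.mean(np.sum(residual_actions**2, axis=-1))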
Niranjan Kumar, Irfan Essa, Sehoon Ha, C. Karen Liu
Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation Proceedings Article
In: Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning, NeurIPS 2020.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2020-Kumar-EMDAOTNM,
title = {Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha and C. Karen Liu},
url = {https://orlrworkshop.github.io/program/orlr_25.html
http://arxiv.org/abs/1907.03964
https://www.kniranjankumar.com/projects/1_mass_prediction
https://www.youtube.com/watch?v=o3zBdVWvWZw
https://kniranjankumar.github.io/assets/pdf/Estimating_Mass_Distribution_of_Articulated_Objects_using_Non_prehensile_Manipulation.pdf},
year = {2020},
date = {2020-12-01},
urldate = {2020-12-01},
booktitle = {Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning},
organization = {NeurIPS},
abstract = {We explore the problem of estimating the mass distribution of an articulated object by an interactive robotic agent. Our method predicts the mass distribution of an object by using limited sensing and actuating capabilities of a robotic agent that is interacting with the object. We are inspired by the role of exploratory play in human infants. We take the combined approach of supervised and reinforcement learning to train an agent that learns to strategically interact with the object to estimate the object's mass distribution. Our method consists of two neural networks: (i) the policy network which decides how to interact with the object, and (ii) the predictor network that estimates the mass distribution given a history of observations and interactions. Using our method, we train a robotic arm to estimate the mass distribution of an object with moving parts (e.g. an articulated rigid body system) by pushing it on a surface with unknown friction properties. We also demonstrate how our training from simulations can be transferred to real hardware using a small amount of real-world data for fine-tuning. We use a UR10 robot to interact with 3D printed articulated chains with varying mass distributions and show that our method significantly outperforms the baseline system that uses random pushes to interact with the object.},
howpublished = {arXiv preprint arXiv:1907.03964},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
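To illustrate why the interaction policy and the predictor are trained together, here is a self-contained toy in which pushes on a two-link object yield displacements that depend on its hidden masses, and the masses are recovered by least squares. The toy dynamics and all names are invented for illustration; the paper uses neural networks for both the policy and the predictor.

import numpy as np

rng = np.random.default_rng(0)

class ToyArticulatedObject:
    # Two links with hidden masses; a push at position x in [0, 1]
    # produces a noisy displacement that mixes the links' inverse masses.
    def __init__(self):
        self.masses = rng.uniform(0.5, 2.0, size=2)
    def push(self, x):
        share = np.array([1.0 - x, x])
        return share @ (1.0 / self.masses) + rng.normal(0.0, 0.01)

def estimate_masses(obj, pushes):
    # Supervised part (sketch): fit inverse masses from observed
    # displacements. The RL policy's role in the paper is to choose
    # pushes that make this estimation problem well conditioned.
    A = np.array([[1.0 - x, x] for x in pushes])
    y = np.array([obj.push(x) for x in pushes])
    inv_m, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 / inv_m

obj = ToyArticulatedObject()
print("true:", obj.masses, "estimated:", estimate_masses(obj, [0.1, 0.9]))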
A. Schödl, I. Essa
Machine Learning for Video-Based Rendering Proceedings Article
In: Advances in Neural Information Processing Systems (NeurIPS), pp. 1002-1008, 2000.
Links | BibTeX | Tags: computer animation, reinforcement learning, video textures
@inproceedings{2000-Schodl-MLVR,
title = {Machine Learning for Video-Based Rendering},
author = {A. Schödl and I. Essa},
url = {https://www.cc.gatech.edu/cpl/projects/videotexture/NIPS2000/index.html},
year = {2000},
date = {2000-12-01},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
pages = {1002-1008},
keywords = {computer animation, reinforcement learning, video textures},
pubstate = {published},
tppubtype = {inproceedings}
}
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced by using the teachPress plugin for WordPress.