A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models Proceedings Article
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8682–8692, 2024.
Abstract | Links | BibTeX | Tags: arXiv, computer vision, CVPR, generative AI
@inproceedings{2024-Xu-PDTTTDM,
title = {Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models},
author = {Xingqian Xu and Jiayi Guo and Zhangyang Wang and Gao Huang and Irfan Essa and Humphrey Shi},
url = {https://openaccess.thecvf.com/content/CVPR2024/papers/Xu_Prompt-Free_Diffusion_Taking_Text_out_of_Text-to-Image_Diffusion_Models_CVPR_2024_paper.pdf
https://openaccess.thecvf.com/content/CVPR2024/html/Xu_Prompt-Free_Diffusion_Taking_Text_out_of_Text-to-Image_Diffusion_Models_CVPR_2024_paper.html
https://arxiv.org/abs/2305.16223},
doi = {10.48550/arXiv.2305.16223},
year = {2024},
date = {2024-06-18},
urldate = {2024-06-18},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {8682--8692},
abstract = {Text-to-image (T2I) research has grown explosively in the past year owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches. Yet one pain point persists: the text prompt engineering and searching high-quality text prompts for customized results is more art than science. Moreover as commonly argued: "an image is worth a thousand words" - the attempt to describe a desired image with texts often ends up being ambiguous and cannot comprehensively cover delicate visual details hence necessitating more additional controls from the visual domain. In this paper we take a bold step forward: taking "Text" out of a pretrained T2I diffusion model to reduce the burdensome prompt engineering efforts for users. Our proposed framework Prompt-Free Diffusion relies on only visual inputs to generate new images: it takes a reference image as "context" an optional image structural conditioning and an initial noise with absolutely no text prompt. The core architecture behind the scene is Semantic Context Encoder (SeeCoder) substituting the commonly used CLIP-based or LLM-based text encoder. The reusability of SeeCoder also makes it a convenient drop-in component: one can also pre-train a SeeCoder in one T2I model and reuse it for another. Through extensive experiments Prompt-Free Diffusion is experimentally found to (i) outperform prior exemplar-based image synthesis approaches; (ii) perform on par with state-of-the-art T2I models using prompts following the best practice; and (iii) be naturally extensible to other downstream applications such as anime figure generation and virtual try-on with promising quality. Our code and models will be open-sourced.
},
keywords = {arXiv, computer vision, CVPR, generative AI},
pubstate = {published},
tppubtype = {inproceedings}
}
Dina Bashkirova, José Lezama, Kihyuk Sohn, Kate Saenko, Irfan Essa
MaskSketch: Unpaired Structure-guided Masked Image Generation Proceedings Article
In: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023.
Abstract | Links | BibTeX | Tags: computer vision, CVPR, generative AI, generative media, google
@inproceedings{2023-Bashkirova-MUSMIG,
title = {MaskSketch: Unpaired Structure-guided Masked Image Generation},
author = {Dina Bashkirova and José Lezama and Kihyuk Sohn and Kate Saenko and Irfan Essa},
url = {https://arxiv.org/abs/2302.05496
https://openaccess.thecvf.com/content/CVPR2023/papers/Bashkirova_MaskSketch_Unpaired_Structure-Guided_Masked_Image_Generation_CVPR_2023_paper.pdf
https://openaccess.thecvf.com/content/CVPR2023/supplemental/Bashkirova_MaskSketch_Unpaired_Structure-Guided_CVPR_2023_supplemental.pdf},
doi = {10.48550/ARXIV.2302.05496},
year = {2023},
date = {2023-06-01},
urldate = {2023-06-01},
booktitle = {IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)},
abstract = {Recent conditional image generation methods produce images of remarkable diversity, fidelity and realism. However, the majority of these methods allow conditioning only on labels or text prompts, which limits their level of control over the generation result. In this paper, we introduce MaskSketch, an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling. MaskSketch utilizes a pre-trained masked generative transformer, requiring no model training or paired supervision, and works with input sketches of different levels of abstraction. We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable structure-guided generation. Our results show that MaskSketch achieves high image realism and fidelity to the guiding structure. Evaluated on standard benchmark datasets, MaskSketch outperforms state-of-the-art methods for sketch-to-image translation, as well as unpaired image-to-image translation approaches.},
keywords = {computer vision, CVPR, generative AI, generative media, google},
pubstate = {published},
tppubtype = {inproceedings}
}
Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang
MAGVIT: Masked Generative Video Transformer Proceedings Article
In: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023.
Abstract | Links | BibTeX | Tags: computational video, computer vision, CVPR, generative AI, generative media, google
@inproceedings{2023-Yu-MMGVT,
title = {MAGVIT: Masked Generative Video Transformer},
author = {Lijun Yu and Yong Cheng and Kihyuk Sohn and José Lezama and Han Zhang and Huiwen Chang and Alexander G. Hauptmann and Ming-Hsuan Yang and Yuan Hao and Irfan Essa and Lu Jiang},
url = {https://arxiv.org/abs/2212.05199
https://magvit.cs.cmu.edu/
https://openaccess.thecvf.com/content/CVPR2023/papers/Yu_MAGVIT_Masked_Generative_Video_Transformer_CVPR_2023_paper.pdf
https://openaccess.thecvf.com/content/CVPR2023/supplemental/Yu_MAGVIT_Masked_Generative_CVPR_2023_supplemental.pdf},
doi = {10.48550/ARXIV.2212.05199},
year = {2023},
date = {2023-06-01},
urldate = {2023-06-01},
booktitle = {IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)},
abstract = {We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model. We introduce a 3D tokenizer to quantize a video into spatial-temporal visual tokens and propose an embedding method for masked video token modeling to facilitate multi-task learning. We conduct extensive experiments to demonstrate the quality, efficiency, and flexibility of MAGVIT. Our experiments show that (i) MAGVIT performs favorably against state-of-the-art approaches and establishes the best-published FVD on three video generation benchmarks, including the challenging Kinetics-600. (ii) MAGVIT outperforms existing methods in inference time by two orders of magnitude against diffusion models and by 60x against autoregressive models. (iii) A single MAGVIT model supports ten diverse generation tasks and generalizes across videos from different visual domains. The source code and trained models will be released to the public.},
keywords = {computational video, computer vision, CVPR, generative AI, generative media, google},
pubstate = {published},
tppubtype = {inproceedings}
}
Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang
Visual Prompt Tuning for Generative Transfer Learning Proceedings Article
In: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023.
Abstract | Links | BibTeX | Tags: computer vision, CVPR, generative AI, generative media, google
@inproceedings{2022-Sohn-VPTGTL,
title = {Visual Prompt Tuning for Generative Transfer Learning},
author = {Kihyuk Sohn and Yuan Hao and José Lezama and Luisa Polania and Huiwen Chang and Han Zhang and Irfan Essa and Lu Jiang},
url = {https://arxiv.org/abs/2210.00990
https://openaccess.thecvf.com/content/CVPR2023/papers/Sohn_Visual_Prompt_Tuning_for_Generative_Transfer_Learning_CVPR_2023_paper.pdf
https://openaccess.thecvf.com/content/CVPR2023/supplemental/Sohn_Visual_Prompt_Tuning_CVPR_2023_supplemental.pdf},
doi = {10.48550/ARXIV.2210.00990},
year = {2023},
date = {2023-06-01},
urldate = {2023-06-01},
booktitle = {IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)},
abstract = {Transferring knowledge from an image synthesis model trained on a large dataset is a promising direction for learning generative image models from various domains efficiently. While previous works have studied GAN models, we present a recipe for learning vision transformers by generative knowledge transfer. We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers. To adapt to a new domain, we employ prompt tuning, which prepends learnable tokens called prompt to the image token sequence, and introduce a new prompt design for our task. We study on a variety of visual domains, including the Visual Task Adaptation Benchmark (VTAB), with varying amounts of training images, and show effectiveness of knowledge transfer and a significantly better image generation quality over existing works.},
keywords = {computer vision, CVPR, generative AI, generative media, google},
pubstate = {published},
tppubtype = {inproceedings}
}
Nathan Frey, Peggy Chi, Weilong Yang, Irfan Essa
Automatic Style Transfer for Non-Linear Video Editing Proceedings Article
In: Proceedings of CVPR Workshop on AI for Content Creation (AICC), 2021.
Links | BibTeX | Tags: computational video, CVPR, google, video editing
@inproceedings{2021-Frey-ASTNVE,
title = {Automatic Style Transfer for Non-Linear Video Editing},
author = {Nathan Frey and Peggy Chi and Weilong Yang and Irfan Essa},
url = {https://arxiv.org/abs/2105.06988
https://research.google/pubs/pub50449/},
doi = {10.48550/arXiv.2105.06988},
year = {2021},
date = {2021-06-01},
urldate = {2021-06-01},
booktitle = {Proceedings of CVPR Workshop on AI for Content Creation (AICC)},
keywords = {computational video, CVPR, google, video editing},
pubstate = {published},
tppubtype = {inproceedings}
}
Huda Alamri, Vincent Cartillier, Abhishek Das, Jue Wang, Anoop Cherian, Irfan Essa, Dhruv Batra, Tim K. Marks, Chiori Hori, Peter Anderson, Stefan Lee, Devi Parikh
Audio Visual Scene-Aware Dialog Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Abstract | Links | BibTeX | Tags: computational video, computer vision, CVPR, embodied agents, vision & language
@inproceedings{2019-Alamri-AVSD,
title = {Audio Visual Scene-Aware Dialog},
author = {Huda Alamri and Vincent Cartillier and Abhishek Das and Jue Wang and Anoop Cherian and Irfan Essa and Dhruv Batra and Tim K. Marks and Chiori Hori and Peter Anderson and Stefan Lee and Devi Parikh},
url = {https://openaccess.thecvf.com/content_CVPR_2019/papers/Alamri_Audio_Visual_Scene-Aware_Dialog_CVPR_2019_paper.pdf
https://video-dialog.com/
https://arxiv.org/abs/1901.09107},
doi = {10.1109/CVPR.2019.00774},
year = {2019},
date = {2019-06-01},
urldate = {2019-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
abstract = {We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the scene and the history of previous turns in the dialog. To answer successfully, agents must ground concepts from the question in the video while leveraging contextual cues from the dialog history. To benchmark this task, we introduce the Audio Visual Scene-Aware Dialog (AVSD) Dataset. For each of more than 11,000 videos of human actions from the Charades dataset, our dataset contains a dialog about the video, plus a final summary of the video by one of the dialog participants. We train several baseline systems for this task and evaluate the performance of the trained models using both qualitative and quantitative metrics. Our results indicate that models must utilize all the available inputs (video, audio, question, and dialog history) to perform best on this dataset.
},
keywords = {computational video, computer vision, CVPR, embodied agents, vision & language},
pubstate = {published},
tppubtype = {inproceedings}
}
Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Links | BibTeX | Tags: computer vision, CVPR, vision & language
@inproceedings{2019-Wijmans-EQAPEWPCP,
title = {Embodied Question Answering in Photorealistic Environments With Point Cloud Perception},
author = {Erik Wijmans and Samyak Datta and Oleksandr Maksymets and Abhishek Das and Georgia Gkioxari and Stefan Lee and Irfan Essa and Devi Parikh and Dhruv Batra},
doi = {10.1109/CVPR.2019.00682},
year = {2019},
date = {2019-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
keywords = {computer vision, CVPR, vision & language},
pubstate = {published},
tppubtype = {inproceedings}
}
Unaiza Ahsan, Irfan Essa
Clustering Social Event Images Using Kernel Canonical Correlation Analysis Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop on Women in Computing (WiC), 2014.
Abstract | Links | BibTeX | Tags: activity recognition, computer vision, CVPR, machine learning
@inproceedings{2014-Ahsan-CSEIUKCCA,
title = {Clustering Social Event Images Using Kernel Canonical Correlation Analysis},
author = {Unaiza Ahsan and Irfan Essa},
url = {https://openaccess.thecvf.com/content_cvpr_workshops_2014/W20/papers/Ahsan_Clustering_Social_Event_2014_CVPR_paper.pdf
https://smartech.gatech.edu/handle/1853/53656},
year = {2014},
date = {2014-06-01},
urldate = {2014-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop on Women in Computing (WiC)},
abstract = {Sharing user experiences in form of photographs, tweets, text, audio and/or video has become commonplace in social networking websites. Browsing through large collections of social multimedia remains a cumbersome task. It requires a user to initiate textual search query and manually go through a list of resulting images to find relevant information. We propose an automatic clustering algorithm, which, given a large collection of images, groups them into clusters of different events using the image features and related metadata. We formulate this problem as a kernel canonical correlation clustering problem in which data samples from different modalities or ‘views’ are projected to a space where correlations between the samples’ projections are maximized. Our approach enables us to learn a semantic representation of potentially uncorrelated feature sets and this representation is clustered to give unique social events. Furthermore, we leverage the rich information associated with each uploaded image (such as usernames, dates/timestamps, etc.) and empirically determine which combination of feature sets yields the best clustering score for a dataset of 100,000 images.
},
keywords = {activity recognition, computer vision, CVPR, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
Steven Hickson, Stan Birchfield, Irfan Essa, Henrik Christensen
Efficient Hierarchical Graph-Based Segmentation of RGBD Videos Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society 2014.
Links | BibTeX | Tags: computational video, computer vision, CVPR, video segmentation
@inproceedings{2014-Hickson-EHGSRV,
title = {Efficient Hierarchical Graph-Based Segmentation of RGBD Videos},
author = {Steven Hickson and Stan Birchfield and Irfan Essa and Henrik Christensen},
url = {http://www.cc.gatech.edu/cpl/projects/4dseg},
year = {2014},
date = {2014-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
organization = {IEEE Computer Society},
keywords = {computational video, computer vision, CVPR, video segmentation},
pubstate = {published},
tppubtype = {inproceedings}
}
James Rehg, Gregory Abowd, Agata Rozga, Mario Romero, Mark Clements, Stan Sclaroff, Irfan Essa, Opal Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye
Decoding Children's Social Behavior Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society 2013, ISSN: 1063-6919.
Abstract | Links | BibTeX | Tags: autism, behavioral imaging, computational health, computer vision, CVPR
@inproceedings{2013-Rehg-DCSB,
title = {Decoding Children's Social Behavior},
author = {James Rehg and Gregory Abowd and Agata Rozga and Mario Romero and Mark Clements and Stan Sclaroff and Irfan Essa and Opal Ousley and Yin Li and Chanho Kim and Hrishikesh Rao and Jonathan Kim and Liliana Lo Presti and Jianming Zhang and Denis Lantsman and Jonathan Bidwell and Zhefan Ye},
url = {https://ieeexplore.ieee.org/document/6619282
http://www.cbi.gatech.edu/mmdb/},
doi = {10.1109/CVPR.2013.438},
issn = {1063-6919},
year = {2013},
date = {2013-06-01},
urldate = {2013-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
organization = {IEEE Computer Society},
abstract = {We introduce a new problem domain for activity recognition: the analysis of children's social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3-5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.
},
keywords = {autism, behavioral imaging, computational health, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
Syed Hussain Raza, Matthias Grundmann, Irfan Essa
Geometric Context from Video Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society 2013.
Links | BibTeX | Tags: computational video, computer vision, CVPR, video segmentation
@inproceedings{2013-Raza-GCFV,
title = {Geometric Context from Video},
author = {Syed Hussain Raza and Matthias Grundmann and Irfan Essa},
url = {http://www.cc.gatech.edu/cpl/projects/videogeometriccontext/},
doi = {10.1109/CVPR.2013.396},
year = {2013},
date = {2013-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
organization = {IEEE Computer Society},
keywords = {computational video, computer vision, CVPR, video segmentation},
pubstate = {published},
tppubtype = {inproceedings}
}
Vinay Bettadapura, Grant Schindler, Thomas Ploetz, Irfan Essa
Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society 2013.
Links | BibTeX | Tags: activity recognition, computational video, computer vision, CVPR
@inproceedings{2013-Bettadapura-ABDDTSIAR,
title = {Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition},
author = {Vinay Bettadapura and Grant Schindler and Thomas Ploetz and Irfan Essa},
url = {http://www.cc.gatech.edu/cpl/projects/abow/},
doi = {10.1109/CVPR.2013.338},
year = {2013},
date = {2013-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
organization = {IEEE Computer Society},
keywords = {activity recognition, computational video, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
Kihwan Kim, Dongryeol Lee, Irfan Essa
Detecting Regions of Interest in Dynamic Scenes with Camera Motions Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2012.
Links | BibTeX | Tags: computer vision, CVPR
@inproceedings{2012-Kim-DRIDSWCM,
title = {Detecting Regions of Interest in Dynamic Scenes with Camera Motions},
author = {Kihwan Kim and Dongryeol Lee and Irfan Essa},
url = {http://www.cc.gatech.edu/cpl/projects/roi/},
doi = {10.1109/CVPR.2012.6247809},
year = {2012},
date = {2012-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
publisher = {IEEE Computer Society},
keywords = {computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
M. Grundmann, V. Kwatra, I. Essa
Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2011.
Links | BibTeX | Tags: computational video, computer vision, CVPR
@inproceedings{2011-Grundmann-AVSWROCP,
title = {Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths},
author = {M. Grundmann and V. Kwatra and I. Essa},
url = {http://www.cc.gatech.edu/cpl/projects/videostabilization/},
doi = {10.1109/CVPR.2011.5995525},
year = {2011},
date = {2011-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
publisher = {IEEE Computer Society},
keywords = {computational video, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
Raffay Hamid, Ramkrishan Kumar, Matthias Grundmann, Kihwan Kim, Irfan Essa, Jessica Hodgins
Player Localization Using Multiple Static Cameras for Sports Visualization Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society Press, 2010.
Links | BibTeX | Tags: activity recognition, computer vision, CVPR, sports visualization
@inproceedings{2010-Hamid-PLUMSCSV,
title = {Player Localization Using Multiple Static Cameras for Sports Visualization},
author = {Raffay Hamid and Ramkrishan Kumar and Matthias Grundmann and Kihwan Kim and Irfan Essa and Jessica Hodgins},
url = {http://www.raffayhamid.com/sports_viz.shtml},
doi = {10.1109/CVPR.2010.5540142},
year = {2010},
date = {2010-06-01},
urldate = {2010-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
publisher = {IEEE Computer Society Press},
organization = {IEEE Computer Society},
keywords = {activity recognition, computer vision, CVPR, sports visualization},
pubstate = {published},
tppubtype = {inproceedings}
}
M. Grundmann, V. Kwatra, M. Han, I. Essa
Efficient Hierarchical Graph-Based Video Segmentation Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
Links | BibTeX | Tags: computational video, computer vision, CVPR, video segmentation
@inproceedings{2010-Grundmann-EHGVS,
title = {Efficient Hierarchical Graph-Based Video Segmentation},
author = {M. Grundmann and V. Kwatra and M. Han and I. Essa},
url = {http://www.cc.gatech.edu/cpl/projects/videosegmentation/},
doi = {10.1109/CVPR.2010.5539893},
year = {2010},
date = {2010-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
keywords = {computational video, computer vision, CVPR, video segmentation},
pubstate = {published},
tppubtype = {inproceedings}
}
M. Grundmann, V. Kwatra, M. Han, I. Essa
Discontinuous Seam-Carving for Video Retargeting Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2010.
Links | BibTeX | Tags: computational video, computer vision, CVPR
@inproceedings{2010-Grundmann-DSVR,
title = {Discontinuous Seam-Carving for Video Retargeting},
author = {M. Grundmann and V. Kwatra and M. Han and I. Essa},
url = {http://www.cc.gatech.edu/cpl/projects/videoretargeting/},
doi = {10.1109/CVPR.2010.5540165},
year = {2010},
date = {2010-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
publisher = {IEEE Computer Society},
keywords = {computational video, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
K. Kim, M. Grundmann, A. Shamir, I. Matthews, J. Hodgins, I. Essa
Motion Field to Predict Play Evolution in Dynamic Sport Scenes Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
Links | BibTeX | Tags: computer vision, CVPR
@inproceedings{2010-Kim-MFPPEDSS,
title = {Motion Field to Predict Play Evolution in Dynamic Sport Scenes},
author = {K. Kim and M. Grundmann and A. Shamir and I. Matthews and J. Hodgins and I. Essa},
doi = {10.1109/CVPR.2010.5540128},
year = {2010},
date = {2010-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
keywords = {computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
P. Yin, A. Criminisi, J. Winn, I. Essa
Tree-based Classifiers for Bilayer Video Segmentation Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8, IEEE Computer Society, Minneapolis, MN, USA, 2007.
BibTeX | Tags: computational video, computer vision, CVPR, video segmentation
@inproceedings{2007-Yin-TCBVS,
title = {Tree-based Classifiers for Bilayer Video Segmentation},
author = {P. Yin and A. Criminisi and J. Winn and I. Essa},
year = {2007},
date = {2007-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {1--8},
publisher = {IEEE Computer Society},
address = {Minneapolis, MN, USA},
keywords = {computational video, computer vision, CVPR, video segmentation},
pubstate = {published},
tppubtype = {inproceedings}
}
Y. Shi, A. Bobick, I. Essa
Learning Temporal Sequence Model from Partially Labeled Data Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1631–1638, IEEE Computer Society, 2006.
BibTeX | Tags: activity recognition, computational video, computer vision, CVPR
@inproceedings{2006-Shi-LTSMFPLD,
title = {Learning Temporal Sequence Model from Partially Labeled Data},
author = {Y. Shi and A. Bobick and I. Essa},
year = {2006},
date = {2006-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {1631--1638},
publisher = {IEEE Computer Society},
keywords = {activity recognition, computational video, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced using the teachPress plugin for WordPress.
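The BibTeX entries above can be copied directly into a bibliography file and cited from LaTeX. A minimal sketch, assuming the entries are saved in a file named publications.bib (a hypothetical filename) and using the citation key of the first entry:

```latex
% Minimal sketch: cite an entry from this list.
% Assumes the BibTeX entries above are saved as publications.bib.
\documentclass{article}
\begin{document}
Prompt-Free Diffusion~\cite{2024-Xu-PDTTTDM} generates images
from visual inputs alone, with no text prompt.
\bibliographystyle{plain}
\bibliography{publications}
\end{document}
```

Compiling with latex, then bibtex, then latex twice resolves the citation and builds the reference list.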