Paper Abstract The massive growth of sports videos has resulted in a need for automatic generation of sports highlights that are comparable in quality to the hand-edited highlights produced by broadcasters such as ESPN. Unlike previous works that mostly use audio-visual cues derived from the video, we propose an approach that additionally leverages contextual cues […]
Paper in ECCV Workshop 2012: “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos”
Citation Abstract We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatiotemporal masks for each object, such as “dog”, without employing any pre-trained object detectors. We formulate this problem as […]
Paper: ACM Multimedia (2008) "Audio Puzzler: Piecing Together Time-Stamped Speech Transcripts with a Puzzle Game"
N. Diakopoulos, K. Luther, I. Essa (2008), “Audio Puzzler: Piecing Together Time-Stamped Speech Transcripts with a Puzzle Game.” In Proceedings of ACM International Conference on Multimedia 2008. Vancouver, BC, Canada [Project Link] ABSTRACT We have developed an audio-based casual puzzle game which produces a time-stamped transcription of spoken audio as a by-product of play. Our evaluation […]