Irfan Essa

April 2023

April 23, 2023 / Last updated: August 9, 2023 / irfan / UIST

Paper in UIST 2023 on “Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access”

Presentation slides commonly use visual patterns for structural navigation, such as titles, dividers, and build slides. However, screen readers do not capture this intent, so blind and visually impaired (BVI) users must consume slides linearly, which is time-consuming and less accessible when content repeats across slides. We present Slide Gestalt, an automatic approach that identifies the hierarchical structure in a slide deck. Slide Gestalt computes the visual and textual correspondences between slides to generate hierarchical groupings. With our UI, readers can interactively navigate the slide deck from the higher-level section overview down to the lower-level description of a slide group or its individual elements. We derived slide consumption and authoring practices from interviews with BVI readers and sighted creators and from an analysis of 100 decks. We evaluated our pipeline on 50 real-world slide decks and a large dataset. Feedback from eight BVI participants showed that Slide Gestalt helped them navigate a slide deck by anchoring content more efficiently, compared to using accessible slides.
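To give a rough feel for what grouping slides by textual correspondence could look like, here is a minimal sketch. It is not the method from the paper: it ignores visual correspondences entirely, compares only the word sets of consecutive slides, and produces a single flat level of groups rather than a hierarchy. The function names, the Jaccard similarity measure, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased, whitespace-separated words of a slide's text."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two word sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_slides(slides: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Group consecutive slides whose text overlaps heavily.

    Hypothetical helper: a slide joins the current group when its word set
    is at least `threshold` similar to the previous slide's word set, which
    roughly catches "build" slides that repeat earlier content.
    """
    groups: List[List[int]] = []
    for i, text in enumerate(slides):
        if groups and jaccard(tokens(slides[i - 1]), tokens(text)) >= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


# Toy deck (made-up text): slides 1 and 2 form a build sequence.
deck = [
    "Agenda",
    "Method overview step one",
    "Method overview step one step two",
    "Results accuracy table",
]
print(group_slides(deck))  # [[0], [1, 2], [3]]
```

In this toy deck the second and third slides share four of five distinct words, so they are merged into one group, while the other slides stay on their own. A reader could then hear one summary for the group instead of the repeated text of each build step.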

Copyright © Irfan Essa All Rights Reserved.
