Paper in UIST 2021 on “Automatic Instructional Video Creation from a Markdown-formatted Tutorial”


We introduce HowToCut, an automatic approach that converts a Markdown-formatted tutorial into an interactive video that presents the visual instructions with a synthesized voiceover for narration. HowToCut extracts instructional content from a multimedia document that describes a step-by-step procedure. Our method selects and converts text instructions to a voiceover. It makes automatic editing decisions to align the narration with edited visual assets, including step images, videos, and text overlays. We derive our video editing strategies from an analysis of 125 web tutorials and apply Computer Vision techniques to the assets. To enable viewers to interactively navigate the tutorial, HowToCut’s conversational UI presents instructions in multiple formats upon user commands. We evaluated our automatically-generated video tutorials through user studies (N=20) and validated the video quality via an online survey (N=93). The evaluation shows that our method was able to effectively create informative and useful instructional videos from a web tutorial document for both reviewing and following.
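The first stages described above (pulling step-by-step content out of a Markdown document and lining narration up with visual assets) can be illustrated with a minimal sketch. This is not HowToCut's implementation: it omits the text selection, speech synthesis, and Computer Vision stages. It assumes that steps are written as a numbered Markdown list and that narration length can be approximated from word count. The names `Step`, `parse_steps`, `build_timeline`, and `WORDS_PER_SECOND` are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical speaking rate, used only to estimate narration length.
WORDS_PER_SECOND = 2.5

# Markdown image syntax: ![alt](path "optional title")
IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# A numbered list item such as "1. Do something"
STEP_RE = re.compile(r"^\s*\d+\.\s+(.*)$")


@dataclass
class Step:
    text: str
    images: list = field(default_factory=list)
    start: float = 0.0
    duration: float = 0.0


def parse_steps(markdown: str) -> list:
    """Split a Markdown tutorial into numbered steps with their images."""
    steps = []
    for line in markdown.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append(Step(text=match.group(1)))
        elif steps and line.strip() and not line.startswith("#"):
            # Non-heading continuation lines belong to the most recent step.
            steps[-1].text += " " + line.strip()
    for step in steps:
        step.images = IMAGE_RE.findall(step.text)
        # Drop image markup so only the narratable text remains.
        step.text = " ".join(IMAGE_RE.sub("", step.text).split())
    return steps


def build_timeline(steps: list) -> list:
    """Give each step a start time and an estimated narration duration."""
    clock = 0.0
    for step in steps:
        step.duration = len(step.text.split()) / WORDS_PER_SECOND
        step.start = clock
        clock += step.duration
    return steps
```

Calling `build_timeline(parse_steps(text))` on a tutorial string returns the steps in order, each carrying its narration text, the image paths it references, and the time window in which those images would be shown. A real system would replace the word-count estimate with the actual length of the synthesized audio.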

Paper / Citation

  • P. Chi, N. Frey, K. Panovich, and I. Essa (2021), “Automatic Instructional Video Creation from a Markdown-Formatted Tutorial,” in ACM Symposium on User Interface Software and Technology (UIST), 2021.
    @InProceedings{ 2021-Chi-AIVCFMT,
    author  = {Peggy Chi and Nathan Frey and Katrina Panovich and Irfan Essa},
    booktitle  = {{ACM Symposium on User Interface Software and Technology (UIST)}},
    month = {October},
    pdf = {},
    publisher  = {ACM Press},
    title = {Automatic Instructional Video Creation from a Markdown-Formatted Tutorial},
    url = {},
    year = {2021}
    }


A short 30-second preview of the paper.
