Paper in IJCARS (2016) on “Automated video-based assessment of surgical skills for training and evaluation in medical schools”


  • A. Zia, Y. Sharma, V. Bettadapura, E. L. Sarin, T. Ploetz, M. A. Clements, and I. Essa (2016), “Automated video-based assessment of surgical skills for training and evaluation in medical schools,” International Journal of Computer Assisted Radiology and Surgery, vol. 11, iss. 9, pp. 1623–1636.
    @Article{ 2016-Zia-AVASSTEMS,
      author    = {Zia, Aneeq and Sharma, Yachna and Bettadapura, Vinay and
                   Sarin, Eric L and Ploetz, Thomas and Clements, Mark A and
                   Essa, Irfan},
      doi       = {10.1007/s11548-016-1468-2},
      journal   = {{International Journal of Computer Assisted Radiology and
                   Surgery}},
      month     = {September},
      number    = {9},
      pages     = {1623--1636},
      publisher = {Springer Berlin Heidelberg},
      title     = {Automated video-based assessment of surgical skills for
                   training and evaluation in medical schools},
      url       = {},
      volume    = {11},
      year      = {2016}
    }


Sample frames from our video dataset

Purpose: Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. A supervisor has to observe each surgical trainee in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches, however, are still very time-consuming and involve human bias. In this paper, we present an automated system that assesses surgical skills by analyzing video data of surgical activities.
Method: We compare different techniques for video-based surgical skill evaluation. We use techniques that capture the motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis.
Results: We successfully classified surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill-relevant information in surgical videos.
Conclusion: Our evaluations show that frequency features perform better than motion texture features, which in turn perform better than symbol/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity as demonstrated by our results on two challenging video datasets.
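
To make the idea of frequency analysis of motion more concrete, here is a minimal sketch of one way to turn a one-dimensional motion time series into frequency features. It is not the implementation from the paper: the function name `frequency_features`, the parameter `n_coeffs`, the choice of a plain discrete Fourier transform, and the synthetic sine-wave "motion" signals are all assumptions made purely for illustration.

```python
import cmath
import math


def frequency_features(series, n_coeffs=8):
    """Magnitudes of the first n_coeffs non-DC DFT bins, L2-normalized.

    Hypothetical helper for illustration only. Assumes `series` holds one
    motion value per frame and that len(series) >= 2 * n_coeffs.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    centered = [v - mean for v in series]  # remove the constant offset

    feats = []
    for k in range(1, n_coeffs + 1):  # skip bin 0 (the DC component)
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(centered)
        )
        feats.append(abs(coeff))

    norm = math.sqrt(sum(f * f for f in feats))
    return [f / norm for f in feats] if norm > 0 else feats


# Synthetic stand-ins for motion signals (assumed, not real video data):
# one slow oscillation and one faster oscillation over a 256-frame window.
n_frames = 256
slow = [math.sin(2 * math.pi * 2 * i / n_frames) for i in range(n_frames)]
fast = [math.sin(2 * math.pi * 6 * i / n_frames) for i in range(n_frames)]

f_slow = frequency_features(slow)
f_fast = frequency_features(fast)
```

In this toy setup the two signals concentrate their energy in different frequency bins (bin 2 for `slow`, bin 6 for `fast`), so the resulting feature vectors are easy to tell apart. A feature vector of this kind could then be passed to any standard classifier; which classifier and which transform work best for real surgical video is a question the paper itself addresses.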
