A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Harish Haresamudram, Irfan Essa, Thomas Plötz
A Washing Machine is All You Need? On the Feasibility of Machine Data for Self-Supervised Human Activity Recognition Proceedings Article
In: International Conference on Activity and Behavior Computing (ABC), 2024.
Abstract | Links | BibTeX | Tags: activity recognition, behavioral imaging, wearable computing
@inproceedings{2024-Haresamudram-WMNFMDSHAR,
title = {A Washing Machine is All You Need? On the Feasibility of Machine Data for Self-Supervised Human Activity Recognition},
author = {Harish Haresamudram and Irfan Essa and Thomas Plötz},
url = {https://ieeexplore.ieee.org/abstract/document/10651688},
doi = {10.1109/ABC61795.2024.10651688},
year = {2024},
date = {2024-05-24},
booktitle = {International Conference on Activity and Behavior Computing (ABC) 2024},
abstract = {Learning representations via self-supervision has emerged as a powerful framework for deriving features for automatically recognizing activities using wearables. The current de facto protocol involves performing pre-training on (large-scale) data recorded from human participants. This requires effort, as recruiting participants and subsequently collecting data is both expensive and time-consuming. In this paper, we investigate the feasibility of an alternate source of data for its suitability to lead to useful representations, one that requires substantially lower effort for data collection. Specifically, we examine whether data collected by affixing sensors on running machinery, i.e., recording non-human movements/vibrations, can also be utilized for self-supervised human activity recognition. We perform an extensive evaluation of utilizing data collected on a washing machine as the source and observe that state-of-the-art methods perform surprisingly well relative to when utilizing large-scale human movement data, obtaining within 5-6% F1-score on some target datasets, and exceeding it on others. In scenarios with limited access to annotations, models trained on the washing-machine data perform comparably or better than end-to-end training, thereby indicating their feasibility and potential for recognizing activities. These results are significant and promising because they have the potential to substantially lower the efforts necessary for deriving effective wearables-based human activity recognition systems.},
keywords = {activity recognition, behavioral imaging, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Edison Thomaz, Cheng Zhang, Irfan Essa, Gregory Abowd
Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study Best Paper Proceedings Article
In: ACM Conference on Intelligent User Interfaces (IUI), 2015.
Abstract | Links | BibTeX | Tags: ACM, activity recognition, AI, awards, behavioral imaging, best paper award, computational health, IUI, machine learning
@inproceedings{2015-Thomaz-IMEARWSFASFS,
title = {Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study},
author = {Edison Thomaz and Cheng Zhang and Irfan Essa and Gregory Abowd},
url = {https://dl.acm.org/doi/10.1145/2678025.2701405},
doi = {10.1145/2678025.2701405},
year = {2015},
date = {2015-05-01},
urldate = {2015-05-01},
booktitle = {ACM Conference on Intelligent User Interfaces (IUI)},
abstract = {Dietary self-monitoring has been shown to be an effective method for weight-loss, but it remains an onerous task despite recent advances in food journaling systems. Semi-automated food journaling can reduce the effort of logging, but often requires that eating activities be detected automatically. In this work we describe results from a feasibility study conducted in-the-wild where eating activities were inferred from ambient sounds captured with a wrist-mounted device; twenty participants wore the device during one day for an average of 5 hours while performing normal everyday activities. Our system was able to identify meal eating with an F-score of 79.8% in a person-dependent evaluation, and with 86.6% accuracy in a person-independent evaluation. Our approach is intended to be practical, leveraging off-the-shelf devices with audio sensing capabilities in contrast to systems for automated dietary assessment based on specialized sensors.},
keywords = {ACM, activity recognition, AI, awards, behavioral imaging, best paper award, computational health, IUI, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
Jonathan Bidwell, Irfan Essa, Agata Rozga, Gregory Abowd
Measuring child visual attention using markerless head tracking from color and depth sensing cameras Proceedings Article
In: Proceedings of International Conference on Multimodal Interfaces (ICMI), 2014.
Abstract | Links | BibTeX | Tags: autism, behavioral imaging, computer vision, ICMI
@inproceedings{2014-Bidwell-MCVAUMHTFCDSC,
title = {Measuring child visual attention using markerless head tracking from color and depth sensing cameras},
author = {Jonathan Bidwell and Irfan Essa and Agata Rozga and Gregory Abowd},
url = {https://dl.acm.org/doi/10.1145/2663204.2663235
http://icmi.acm.org/2014/},
doi = {10.1145/2663204.2663235},
year = {2014},
date = {2014-11-01},
urldate = {2014-11-01},
booktitle = {Proceedings of International Conference on Multimodal Interfaces (ICMI)},
abstract = {A child's failure to respond to his or her name being called is an early warning sign for autism, and response to name is currently assessed as a part of standard autism screening and diagnostic tools. In this paper, we explore markerless child head tracking as an unobtrusive approach for automatically predicting child response to name. Head turns are used as a proxy for visual attention. We analyzed 50 recorded response to name sessions with the goal of predicting if children, ages 15 to 30 months, responded to name calls by turning to look at an examiner within a defined time interval. The child's head turn angles and hand-annotated child name call intervals were extracted from each session. Human-assisted tracking was employed using an overhead Kinect camera, and automated tracking was later employed using an additional forward-facing camera as a proof of concept. We explore two distinct analytical approaches for predicting child responses, one relying on a rule-based approach and another on random forest classification. In addition, we derive child response latency as a new measurement that could provide researchers and clinicians with finer-grained quantitative information currently unavailable in the field due to human limitations. Finally, we reflect on steps for adapting our system to work in less constrained natural settings.},
keywords = {autism, behavioral imaging, computer vision, ICMI},
pubstate = {published},
tppubtype = {inproceedings}
}
Jonathan Bidwell, Agata Rozga, J. Kim, H. Rao, Mark Clements, Irfan Essa, Gregory Abowd
Automated Prediction of a Child's Response to Name from Audio and Video Proceedings Article
In: Proceedings of the Annual Conference of the International Society for Autism Research, IMFAR, 2014.
Abstract | Links | BibTeX | Tags: autism, behavioral imaging, computational health
@inproceedings{2014-Bidwell-APCRNFAV,
title = {Automated Prediction of a Child's Response to Name from Audio and Video},
author = {Jonathan Bidwell and Agata Rozga and J. Kim and H. Rao and Mark Clements and Irfan Essa and Gregory Abowd},
url = {https://imfar.confex.com/imfar/2014/webprogram/Paper16999.html
https://www.researchgate.net/publication/268143304_Automated_Prediction_of_a_Child's_Response_to_Name_from_Audio_and_Video},
year = {2014},
date = {2014-05-01},
urldate = {2014-05-01},
booktitle = {Proceedings of the Annual Conference of the International Society for Autism Research},
organization = {IMFAR},
abstract = {Evidence has shown that a child’s failure to respond to name is an early warning sign for autism and is measured as a part of standard assessments, e.g., ADOS [1,2]. Objectives: Build a fully automated system for measuring a child’s response to his or her name being called given video and recorded audio during a social interaction. Here our initial goal is to enable this measurement in a naturalistic setting, with the long-term goal of eventually obtaining finer-grained behavior measurements such as child response time latency between a name call and a response. Methods: We recorded 40 social interactions between an examiner and children (ages 15-24 months). Six of our 40 child participants showed signs of developmental delay based on standardized parent report measures (M-CHAT, CSBS-ITC, CBCL language development survey). The child sat at a table with a toy to play with. The examiner wore a lapel microphone and called the child’s name up to 3 times while standing to the right and slightly behind the child. These interactions were recorded with two cameras that we used in conjunction with the examiner’s audio for predicting when the child responded. Name calls were measured by 1) detecting when an examiner called the child’s name and 2) evaluating whether the child turned to make eye contact with the examiner. Examiner name calls were detected using a speech detection algorithm. Meanwhile, the child’s head turns were tracked using a pair of cameras, which consisted of an overhead Kinect color and depth camera and a front-facing color camera. These speech and head turn measurements were used to train a binary classifier for automatically predicting if and when a child responds to his or her name being called. The result is a system for predicting the child’s response to his or her name being called from automatically recorded audio and video of the session. Results: The system was evaluated against human coding of the child’s response to name from video. If the automated prediction fell within +/- 1 second of the human-coded response, then we recorded a match. Across our 40 sessions we had 56 name calls, 35 responses, and 5 children that did not respond to name. Our software correctly predicted children’s response to name with a precision of 90% and a recall of 85%.},
keywords = {autism, behavioral imaging, computational health},
pubstate = {published},
tppubtype = {inproceedings}
}
James Rehg, Gregory Abowd, Agata Rozga, Mario Romero, Mark Clements, Stan Sclaroff, Irfan Essa, Opal Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye
Decoding Children's Social Behavior Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2013, ISSN: 1063-6919.
Abstract | Links | BibTeX | Tags: autism, behavioral imaging, computational health, computer vision, CVPR
@inproceedings{2013-Rehg-DCSB,
title = {Decoding Children's Social Behavior},
author = {James Rehg and Gregory Abowd and Agata Rozga and Mario Romero and Mark Clements and Stan Sclaroff and Irfan Essa and Opal Ousley and Yin Li and Chanho Kim and Hrishikesh Rao and Jonathan Kim and Liliana Lo Presti and Jianming Zhang and Denis Lantsman and Jonathan Bidwell and Zhefan Ye},
url = {https://ieeexplore.ieee.org/document/6619282
http://www.cbi.gatech.edu/mmdb/},
doi = {10.1109/CVPR.2013.438},
issn = {1063-6919},
year = {2013},
date = {2013-06-01},
urldate = {2013-06-01},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
organization = {IEEE Computer Society},
abstract = {We introduce a new problem domain for activity recognition: the analysis of children's social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3-5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.
},
keywords = {autism, behavioral imaging, computational health, computer vision, CVPR},
pubstate = {published},
tppubtype = {inproceedings}
}
Edison Thomaz, Aman Parnami, Irfan Essa, Gregory Abowd
Feasibility of Identifying Eating Moments from First-Person Images Leveraging Human Computation Proceedings Article
In: Proceedings of ACM International SenseCam and Pervasive Imaging (SenseCam '13), 2013.
Links | BibTeX | Tags: activity recognition, behavioral imaging, computational health, ubiquitous computing, wearable computing
@inproceedings{2013-Thomaz-FIEMFFILHC,
title = {Feasibility of Identifying Eating Moments from First-Person Images Leveraging Human Computation},
author = {Edison Thomaz and Aman Parnami and Irfan Essa and Gregory Abowd},
doi = {10.1145/2526667.2526672},
year = {2013},
date = {2013-01-01},
urldate = {2013-01-01},
booktitle = {Proceedings of ACM International SenseCam and Pervasive Imaging (SenseCam '13)},
keywords = {activity recognition, behavioral imaging, computational health, ubiquitous computing, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Edison Thomaz, Thomas Plötz, Irfan Essa, Gregory Abowd
Interactive Techniques for Labeling Activities Of Daily Living to Assist Machine Learning Proceedings Article
In: Proceedings of Workshop on Interactive Systems in Healthcare, 2011.
Abstract | Links | BibTeX | Tags: activity recognition, behavioral imaging, computational health, wearable computing
@inproceedings{2011-Thomaz-ITLADLAML,
title = {Interactive Techniques for Labeling Activities Of Daily Living to Assist Machine Learning},
author = {Edison Thomaz and Thomas Plötz and Irfan Essa and Gregory Abowd},
url = {https://wish2011.wordpress.com/accepted-papers/
https://users.ece.utexas.edu/~ethomaz/papers/w1.pdf},
year = {2011},
date = {2011-11-01},
urldate = {2011-11-01},
booktitle = {Proceedings of Workshop on Interactive Systems in Healthcare},
abstract = {Over the next decade, as healthcare continues its march away from the hospital and towards the home, logging and making sense of activities of daily living will play a key role in health modeling and life-long home care. Machine learning research has explored ways to automate the detection and quantification of these activities in sensor-rich environments. While we continue to make progress in developing practical and cost-effective activity sensing techniques, one large hurdle remains: obtaining labeled activity data to train activity recognition systems. In this paper, we discuss the process of gathering ground truth data with human participation for health modeling applications. In particular, we propose a criterion and design space containing five dimensions that we have identified as central to the characterization and evaluation of interactive labeling methods.},
keywords = {activity recognition, behavioral imaging, computational health, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced using the teachPress plugin for WordPress.