ABSTRACT
Prior work has revealed that Deaf and Hard of Hearing (DHH) viewers are concerned about captions occluding other onscreen content, e.g., text or faces, especially in live television programming, for which captions are generally not manually placed. To support the evaluation or placement of captions across several genres of live television, empirical evidence is needed on how DHH viewers prioritize onscreen information, and whether this varies by genre. Nineteen DHH participants rated the importance of various onscreen content regions across 6 genres: News, Interviews, Emergency Announcements, Political Debates, Weather News, and Sports. The importance of content regions varied significantly across several genres, motivating genre-specific caption placement. We also demonstrate how the dataset informs the creation of importance weights for a metric that predicts the severity of captions occluding onscreen content. This metric correlated significantly more strongly with 23 DHH participants' judgments of caption quality than a metric with uniform importance weights for content regions.
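The metric described above weights each occluded content region by its importance to viewers. A minimal sketch of such an importance-weighted occlusion score follows; the region coordinates, weight values, and function name are hypothetical illustrations under the assumption that severity is the weighted sum of the fraction of each region a caption covers, not values or code from the study:

```python
def occlusion_severity(caption_box, regions):
    """Importance-weighted occlusion score (illustrative sketch).

    caption_box: (x1, y1, x2, y2) rectangle of the caption.
    regions: list of ((x1, y1, x2, y2), weight) pairs, where weight
             is a genre-specific importance rating for that region.
    Returns the sum over regions of weight * fraction-of-region-covered.
    """
    total = 0.0
    for (rx1, ry1, rx2, ry2), weight in regions:
        # Intersection rectangle between the caption and this region.
        ix1, iy1 = max(caption_box[0], rx1), max(caption_box[1], ry1)
        ix2, iy2 = min(caption_box[2], rx2), min(caption_box[3], ry2)
        iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
        region_area = (rx2 - rx1) * (ry2 - ry1)
        total += weight * (iw * ih) / region_area
    return total


# Caption at the bottom of a 100x100 frame; a high-importance ticker
# band it overlaps, and a face region it does not (weights are made up).
caption = (10, 80, 90, 95)
regions = [((0, 70, 100, 100), 0.9),   # ticker band: overlaps caption
           ((40, 10, 60, 30), 0.3)]    # face region: no overlap
print(occlusion_severity(caption, regions))
```

A uniform-weight baseline, as compared against in the abstract, corresponds to setting every region's weight to the same constant.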
Index Terms
- Caption-occlusion severity judgments across live-television genres from deaf and hard-of-hearing viewers