DOI: 10.1145/3430263.3452429
Research article

Caption-occlusion severity judgments across live-television genres from deaf and hard-of-hearing viewers

Published: 20 May 2021

ABSTRACT

Prior work has revealed that Deaf and Hard of Hearing (DHH) viewers are concerned about captions occluding other onscreen content, e.g., text or faces, especially for live television programming, for which captions are generally not manually placed. To support the evaluation or placement of captions across several genres of live television, empirical evidence is needed on how DHH viewers prioritize onscreen information, and whether this varies by genre. Nineteen DHH participants rated the importance of various onscreen content regions across six genres: News, Interviews, Emergency Announcements, Political Debates, Weather News, and Sports. The importance of content regions varied significantly across several genres, motivating genre-specific caption placement. We also demonstrate how the dataset informs the creation of importance weights for a metric that predicts the severity of captions occluding onscreen content. This metric correlated significantly better with 23 DHH participants' judgments of caption quality than a metric with uniform importance weights across content regions.
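To make the metric's structure concrete, the sketch below shows one plausible way to compute an importance-weighted occlusion-severity score of the kind the abstract describes, alongside a uniform-weight baseline. The region names, weight values, and weighted-average formulation are illustrative assumptions, not the paper's published metric.

```python
# A minimal sketch of an importance-weighted occlusion-severity score, assuming
# per-genre importance weights for onscreen content regions and a measured
# fraction of each region covered by the caption box. Region names and weight
# values below are illustrative assumptions, not values from the paper.

# Hypothetical genre-specific importance weights (0 = ignorable, 1 = critical).
GENRE_WEIGHTS = {
    "weather": {"map": 0.9, "ticker": 0.4, "anchor_face": 0.6},
    "sports": {"scoreboard": 0.9, "play_area": 0.7, "ticker": 0.3},
}


def weighted_severity(genre, occluded):
    """Importance-weighted occlusion severity for one frame, in [0, 1].

    `occluded` maps region name -> fraction (0-1) of that region covered
    by the caption box.
    """
    weights = GENRE_WEIGHTS[genre]
    total = sum(weights.values())
    return sum(w * occluded.get(r, 0.0) for r, w in weights.items()) / total


def uniform_severity(occluded):
    """Baseline that treats every content region as equally important."""
    return sum(occluded.values()) / len(occluded) if occluded else 0.0


# A caption covering half of the weather map is judged more severe under
# genre-specific weights than under the uniform baseline.
frame = {"map": 0.5, "ticker": 0.0, "anchor_face": 0.0}
print(weighted_severity("weather", frame))  # ~0.24
print(uniform_severity(frame))              # ~0.17
```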


Published in

W4A '21: Proceedings of the 18th International Web for All Conference
April 2021, 224 pages
ISBN: 9781450382120
DOI: 10.1145/3430263

      Copyright © 2021 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Acceptance Rates

Overall acceptance rate: 171 of 371 submissions, 46%
