ABSTRACT
Accessing images online is often difficult for users with vision impairments. This population relies on text descriptions of images that vary based on website authors' accessibility practices. Where one author might provide a descriptive caption for an image, another might provide no caption for the same image, leading to inconsistent experiences. In this work, we present the Caption Crawler system, which uses reverse image search to find existing captions on the web and make them accessible to a user's screen reader. We report our system's performance on a set of 481 websites from alexa.com's list of most popular sites to estimate caption coverage and latency, and also report blind and sighted users' ratings of our system's output quality. Finally, we conducted a user study with fourteen screen reader users to examine how the system might be used for personal browsing.
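The caption-reuse idea described above can be illustrated with a minimal sketch. It is not the Caption Crawler implementation: it assumes a reverse image search step (not shown, and hypothetical here) has already returned pairs of page HTML and the URL under which the matching image appears on that page, and it only reads the alt attribute. The names AltTextCollector and collect_captions are made up for this illustration.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect non-empty alt text from <img> tags whose src equals target_src."""

    def __init__(self, target_src):
        super().__init__()
        self.target_src = target_src
        self.captions = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Attribute values can be None (valueless attribute), so fall back to "".
        alt = (attr_map.get("alt") or "").strip()
        if attr_map.get("src") == self.target_src and alt:
            self.captions.append(alt)


def collect_captions(search_hits):
    """Return the distinct captions found across pages, in first-seen order.

    search_hits is an assumed input shape: a list of (page_html, image_src)
    pairs, one per page where the hypothetical search found the same image.
    """
    found = []
    for page_html, image_src in search_hits:
        parser = AltTextCollector(image_src)
        parser.feed(page_html)
        parser.close()
        found.extend(parser.captions)
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(found))


if __name__ == "__main__":
    hits = [
        ('<p><img src="a.jpg" alt="A dog on a beach"></p>', "a.jpg"),
        ('<img src="b.jpg" alt="">', "b.jpg"),
    ]
    for caption in collect_captions(hits):
        print(caption)
```

How the collected captions are ranked and handed to the screen reader is left out, since the abstract does not specify it.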
Index Terms
- Caption Crawler: Enabling Reusable Alternative Text Descriptions using Reverse Image Search