DOI: 10.1145/3411764.3445402
research-article

Designing Ground Truth and the Social Life of Labels

Published: 07 May 2021

ABSTRACT

Ground-truth labeling is an important activity in machine learning. Many studies have examined how crowdworkers apply labels to records in machine learning datasets. However, few studies have examined the work of domain experts when their knowledge and expertise are needed to apply labels.

We provide a grounded account of the work of labeling teams with domain experts, including the experiences of labeling, collaborative configurations and work practices, and quality issues. We show three major patterns in the social design of ground truth data: Principled design, Iterative design, and Improvisational design. We interpret our results through theories from Human Centered Data Science, particularly work on human interventions in data science work through the design and creation of data.

Published in

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN: 9781450380966
DOI: 10.1145/3411764

Copyright © 2021 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
