DOI: 10.1145/3469096.3469873 — DocEng conference proceedings
research-article

Heuristic stopping rules for technology-assisted review

Published: 16 August 2021

ABSTRACT

Technology-assisted review (TAR) refers to human-in-the-loop active learning workflows for finding relevant documents in large collections. These workflows often must meet a target for the proportion of relevant documents found (i.e., recall) while also holding down costs. A variety of heuristic stopping rules have been suggested for striking this tradeoff in particular settings, but none have been tested against a range of recall targets and tasks. We propose two new heuristic stopping rules, Quant and QuantCI, based on model-based estimation techniques from survey research. We compare them against a range of previously proposed heuristics and find that they accurately hit a range of recall targets while substantially reducing review costs.
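The general idea behind a quantification-based stopping rule can be sketched as follows. This is an illustrative simplification, not the paper's exact Quant or QuantCI rule: it assumes the classifier emits calibrated relevance probabilities, estimates the number of relevant documents still unreviewed as the sum of those probabilities, and stops once estimated recall reaches the target. All names and parameters here are hypothetical.

```python
def should_stop(found_relevant, unreviewed_probs, target_recall=0.8):
    """Quantification-style stopping check (illustrative sketch).

    found_relevant:   number of relevant documents found so far by review
    unreviewed_probs: calibrated relevance probabilities for the documents
                      not yet reviewed
    target_recall:    the recall goal the review must meet

    Estimate the relevant documents remaining as the sum of probabilities
    over the unreviewed pool, then stop once estimated recall >= target.
    """
    est_remaining = sum(unreviewed_probs)
    est_total = found_relevant + est_remaining
    if est_total == 0:
        return True  # no relevant documents estimated anywhere; nothing to find
    return found_relevant / est_total >= target_recall

# Example: 80 relevant found; the model assigns low probabilities to the
# unreviewed pool (summing to 5), so estimated recall is 80/85 ~ 0.94.
print(should_stop(80, [0.4, 0.3, 0.2, 0.1] * 5, target_recall=0.8))  # True
```

Note that this estimator is only as good as the calibration of the underlying scores; the paper's QuantCI variant additionally accounts for uncertainty in the estimate rather than comparing point estimates directly.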


Published in

DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering
August 2021, 178 pages
ISBN: 9781450385961
DOI: 10.1145/3469096
Copyright © 2021 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 178 of 537 submissions, 33%
