DOI: 10.1145/3409256.3409824
Research article

Training Data Optimization for Pairwise Learning to Rank

Published: 14 September 2020

ABSTRACT

This paper studies data optimization for Learning to Rank (LtR): dropping training labels to increase ranking accuracy. Our work is inspired by data dropout, which shows that some training data do not positively influence learning and are better dropped out, despite the common belief that a larger training dataset is beneficial. Our main contribution is to extend this intuition to noisy- and semi-supervised LtR scenarios: some human annotations can be noisy or out-of-date, as can machine-generated pseudo-labels in semi-supervised scenarios. Dropping such unreliable labels would benefit both scenarios. State-of-the-art work proposes the Influence Function (IF) for estimating how each training instance affects learning, and we identify and overcome two challenges specific to LtR. 1) Non-convex ranking functions violate the assumptions required for the robustness of IF estimation. 2) The pairwise learning of LtR incurs quadratic estimation overhead. Our technical contributions address these challenges: first, we revise estimation and data optimization to accommodate reduced reliability; second, we devise a group-wise estimation, reducing cost while keeping accuracy high. We validate the effectiveness of our approach on a wide range of ad-hoc information retrieval benchmarks and real-life search engine datasets, in both noisy- and semi-supervised scenarios.
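To make the influence-based dropout idea concrete, here is a minimal sketch, not the paper's actual method: we substitute a convex RankNet-style logistic model on pairwise difference vectors (so the Hessian assumption behind IF holds), score each training pair with the classic influence formula I(z) = -g_val^T H^{-1} g_z, and flag high-scoring (harmful) pairs for dropping. All function names and the synthetic data are illustrative assumptions.

```python
import numpy as np

def pairwise_features(X, pairs):
    """A pair (i, j), with item i preferred over j, becomes x_i - x_j."""
    return np.array([X[i] - X[j] for i, j in pairs])

def fit_logistic(D, lam=0.1, iters=300, lr=0.5):
    """Minimize mean log(1 + exp(-D w)) + (lam/2)||w||^2 by gradient descent."""
    w = np.zeros(D.shape[1])
    for _ in range(iters):
        s = 1.0 / (1.0 + np.exp(D @ w))            # sigma(-w^T d) per pair
        grad = -(D * s[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def influence_on_val(w, D_train, D_val, lam=0.1):
    """Influence of upweighting each training pair on the validation loss.

    I(z) = -g_val^T H^{-1} g_z; positive scores mark pairs whose upweighting
    would increase validation loss, i.e. candidates for dropping.
    """
    p = 1.0 / (1.0 + np.exp(-(D_train @ w)))       # sigma(w^T d)
    H = (D_train.T * (p * (1.0 - p))) @ D_train / len(D_train) \
        + lam * np.eye(len(w))                     # regularized Hessian
    s_val = 1.0 / (1.0 + np.exp(D_val @ w))
    g_val = -(D_val * s_val[:, None]).mean(axis=0) # validation-set gradient
    h = np.linalg.solve(H, g_val)                  # H^{-1} g_val
    s_tr = 1.0 / (1.0 + np.exp(D_train @ w))
    G = -(D_train * s_tr[:, None])                 # per-pair gradients g_z
    return -(G @ h)                                # one score per pair

# Synthetic demo: the first 30 of 150 training pairs have flipped preferences.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
scores = X @ np.array([1.0, 0.5, -0.8])            # "true" relevance scores
cand = [(i, j) if scores[i] > scores[j] else (j, i)
        for i, j in rng.integers(0, 60, size=(400, 2))
        if abs(scores[i] - scores[j]) > 0.5]
train_pairs = [(j, i) for i, j in cand[:30]] + cand[30:150]  # flip first 30
D_train = pairwise_features(X, train_pairs)
D_val = pairwise_features(X, cand[150:200])                  # clean pairs

w = fit_logistic(D_train)
infl = influence_on_val(w, D_train, D_val)
```

In this toy setup, the flipped (noisy) pairs should receive the higher, more harmful scores, so dropping the top-scoring pairs removes mostly noise. Note this sketch scores each of the O(n^2) pairs individually; the paper's group-wise estimation exists precisely to avoid that cost.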

Supplemental Material

3409256.3409824.mp4 (mp4, 96.8 MB)


Published in

ICTIR '20: Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval
September 2020, 207 pages
ISBN: 9781450380676
DOI: 10.1145/3409256
Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


        Acceptance Rates

Overall Acceptance Rate: 209 of 482 submissions, 43%
