Abstract
Quantification is the task of providing an aggregate estimate (e.g., the class distribution in a classification problem) for unseen test sets by applying a model trained on a training set with a different data distribution. Several real-world applications demand this kind of method, which does not require predictions for individual examples and focuses only on obtaining accurate estimates at an aggregate level. Over the past few years, several quantification methods have been proposed from different perspectives and with different goals. This article presents a unified review of the main approaches, with the aim of serving as an introductory tutorial for newcomers to the field.
Index Terms
- A Review on Quantification Learning