
2022 | OriginalPaper | Book chapter

Minimum Wasserstein Distance Estimator Under Finite Location-Scale Mixtures

Authors: Qiong Zhang, Jiahua Chen

Published in: Advances and Innovations in Statistics and Data Science

Publisher: Springer International Publishing


Abstract

When a population exhibits heterogeneity, we often model it via a finite mixture: we decompose it into several different but homogeneous subpopulations. Contemporary practice favors learning the mixtures by maximizing the likelihood, for its statistical efficiency, and using the convenient EM algorithm for numerical computation. Yet the maximum likelihood estimate (MLE) is not well defined for finite location-scale mixtures in general. We hence investigate feasible alternatives to the MLE such as minimum distance estimators. Recently, the Wasserstein distance has drawn increased attention in the machine learning community. It has an intuitive geometric interpretation and is successfully employed in many new applications. Do we gain anything by learning finite location-scale mixtures via a minimum Wasserstein distance estimator (MWDE)? This chapter investigates this possibility in several respects. We find that the MWDE is consistent and derive a numerical solution under finite location-scale mixtures. We study its robustness against outliers and mild model misspecification. Our moderate-scale simulation study shows that the MWDE suffers some efficiency loss against a penalized version of the MLE in general, without noticeable gain in robustness. We reaffirm the general superiority of the likelihood-based learning strategies even for the non-regular finite location-scale mixtures.
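
As a rough illustration of why the MLE is not well defined here, the sketch below evaluates a two-component normal mixture on a small made-up sample, placing one component exactly on the first observation and shrinking its scale. The log-likelihood grows without bound, while a Wasserstein distance between the empirical and fitted distributions stays finite. The data, the fixed weight 0.5, the choice of the 1-Wasserstein distance, the integration range and the function names are all assumptions made for this example; they are not taken from the chapter, whose estimator and numerical method may differ.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def mixture_loglik(data, w, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of a two-component normal mixture."""
    return sum(
        math.log(w * normal_pdf(x, mu1, sigma1)
                 + (1.0 - w) * normal_pdf(x, mu2, sigma2))
        for x in data
    )

def wasserstein1(data, w, mu1, sigma1, mu2, sigma2,
                 lo=-10.0, hi=10.0, cells=20000):
    """Approximate W1 between the empirical and mixture distributions.

    Uses the one-dimensional identity W1 = integral of |F_n(x) - F(x)| dx,
    evaluated with a midpoint rule on [lo, hi]. The truncated range is an
    assumption that suits this toy sample only.
    """
    n = len(data)
    h = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * h
        ecdf = sum(1 for d in data if d <= x) / n
        cdf = (w * normal_cdf(x, mu1, sigma1)
               + (1.0 - w) * normal_cdf(x, mu2, sigma2))
        total += abs(ecdf - cdf) * h
    return total

# Made-up sample, for illustration only.
data = [-1.3, -0.4, 0.1, 0.7, 1.9, 2.5]
scales = (1.0, 1e-2, 1e-4, 1e-8)

# Component 1 sits on data[0]; its scale shrinks toward zero.
logliks = [mixture_loglik(data, 0.5, data[0], s, 0.0, 1.0) for s in scales]
distances = [wasserstein1(data, 0.5, data[0], s, 0.0, 1.0) for s in scales]

for s, ll, d in zip(scales, logliks, distances):
    print(f"sigma1={s:g}  loglik={ll:.3f}  W1={d:.4f}")
```

In this toy setting the log-likelihood keeps increasing as `sigma1` shrinks, so no maximizer exists, whereas the distance settles to a finite value. A minimum-distance fit would then minimize such a distance over the mixture parameters with a numerical optimizer.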

Metadata
Title
Minimum Wasserstein Distance Estimator Under Finite Location-Scale Mixtures
Authors
Qiong Zhang
Jiahua Chen
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-031-08329-7_4
