Published in: Quality & Quantity 5/2019

04.04.2018

An integrated algorithm for three-way compositional data

Written by: Michele Gallo, Violetta Simonacci, Maria Anna Di Palma

Abstract

Compositional data with a three-dimensional structure are not uncommon in the social sciences. The CANDECOMP/PARAFAC model is one of the most suitable techniques for modeling such arrays without confounding the variability of the different modes. Estimating its parameters can be particularly difficult in this setting, both because compositional data are multicollinear by definition and because the exact number of latent variables is generally harder to determine for socio-economic data. The fitting procedure most widely used in the literature is the PARAFAC-ALS algorithm, which is, however, sensitive to both difficulties: multicollinearity and the use of the wrong number of factors. In this work, an integrated PARAFAC-ALS algorithm initialized with SWATLD steps is proposed as an effective remedy for these deficiencies. The approach is tested on simulated multicollinear data against standard ALS. It proves more robust to over-factoring and temporary degeneracies, converges faster even in the presence of collinearity, and still provides a least-squares solution.
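
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of standard PARAFAC-ALS applied to centered log-ratio (clr) transformed data, not the authors' integrated algorithm: the SWATLD initialization steps are not implemented here, and a random start is used instead. The function names (`clr`, `khatri_rao`, `parafac_als`), the array layout (units x parts x occasions), the stopping rule and the simulated Dirichlet data are all assumptions made for the example.

```python
import numpy as np

def clr(X, axis=1):
    """Centered log-ratio transform along the parts axis (all parts must be > 0)."""
    logX = np.log(X)
    return logX - logX.mean(axis=axis, keepdims=True)

def khatri_rao(U, V):
    """Column-wise Kronecker product: (J x R) and (K x R) -> (J*K x R)."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def parafac_als(X, rank, n_iter=500, tol=1e-10, seed=0):
    """Plain alternating least squares for X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r].

    Assumption: random initialization (the paper instead starts from SWATLD steps).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-wise unfoldings, ordered to match khatri_rao above.
    X1 = X.reshape(I, J * K)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)
    prev = np.inf
    for _ in range(n_iter):
        # Each update is a linear least-squares problem with the other two modes fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
        loss = np.sum((X1 - A @ khatri_rao(B, C).T) ** 2)
        if abs(prev - loss) <= tol * max(loss, 1.0):
            break
        prev = loss
    return A, B, C, loss

# Hypothetical data: 10 units, 4 compositional parts, 3 occasions.
rng = np.random.default_rng(1)
comp = rng.dirichlet(np.ones(4), size=(10, 3)).transpose(0, 2, 1)
Z = clr(comp, axis=1)  # clr coordinates sum to zero across parts
A, B, C, loss = parafac_als(Z, rank=2)
```

Because the clr coordinates sum to zero across parts, the transformed array is collinear by construction, which is the setting in which the abstract reports that plain ALS struggles.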

Metadata
Title
An integrated algorithm for three-way compositional data
Authors
Michele Gallo
Violetta Simonacci
Maria Anna Di Palma
Publication date
04.04.2018
Publisher
Springer Netherlands
Published in
Quality & Quantity / Issue 5/2019
Print ISSN: 0033-5177
Electronic ISSN: 1573-7845
DOI
https://doi.org/10.1007/s11135-018-0745-2
