2016 | OriginalPaper | Chapter

Learning Multi-faceted Activities from Heterogeneous Data with the Product Space Hierarchical Dirichlet Processes

Authors: Thanh-Binh Nguyen, Vu Nguyen, Svetha Venkatesh, Dinh Phung

Published in: Trends and Applications in Knowledge Discovery and Data Mining

Publisher: Springer International Publishing

Abstract

The hierarchical Dirichlet process (HDP) was originally designed for, and evaluated on, a single data channel. In this paper we extend it to model heterogeneous data by giving the base measure a richer structure, namely a product space. The extended model, called the Product Space HDP (PS-HDP), can (1) simultaneously model heterogeneous data from multiple sources in a Bayesian nonparametric framework and (2) discover multilevel latent structures in the data, yielding different types of topics/latent structures that can be explained jointly. We experimented with the MDC dataset, a large real-world dataset collected from mobile phones. Our goal was to discover identity–location–time (a.k.a. who-where-when) patterns at different levels (globally for all groups and locally for each group). We analyzed the activities and patterns learned by our model, and visualized, compared, and contrasted them with the ground truth to demonstrate the merit of the proposed framework. We further quantitatively evaluated its performance using standard metrics including F1-score, NMI, RI, and purity. We also compared the performance of the PS-HDP model with that of popular existing clustering methods (including K-Means, NNMF, GMM, DP-Means, and AP). Lastly, we demonstrated the ability of the model to learn activities in the presence of missing data, a common problem in pervasive and ubiquitous computing applications.
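For readers unfamiliar with the clustering metrics named in the abstract, the following is a minimal sketch of three of them, assuming that NMI and RI denote normalized mutual information and the Rand index in their usual sense. The function names, the toy labels, and the choice to normalize mutual information by the geometric mean of the two entropies are illustrative assumptions, not details taken from the chapter. The F1-score is omitted because its definition for clustering varies between papers.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def purity(truth, pred):
    """Fraction of points that belong to the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(truth, pred):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(truth)


def rand_index(truth, pred):
    """Fraction of point pairs on which both labelings agree (together or apart)."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs)
    return agree / len(pairs)


def nmi(truth, pred):
    """Mutual information divided by sqrt(H(truth) * H(pred)).

    This normalization is one common convention (an assumption here).
    Returns 0.0 when either labeling has zero entropy.
    """
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_truth = -sum(c / n * log(c / n) for c in ct.values())
    h_pred = -sum(c / n * log(c / n) for c in cp.values())
    if h_truth == 0 or h_pred == 0:
        return 0.0
    return mi / sqrt(h_truth * h_pred)


# Hypothetical toy labels: six points, two true classes, three predicted clusters.
truth = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 2, 2]
print(purity(truth, pred), rand_index(truth, pred), nmi(truth, pred))
```

All three scores are invariant to how cluster labels are named, which matters when comparing a nonparametric model (whose number of clusters is inferred) against methods such as K-Means that require it to be fixed in advance.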

Metadata
Title
Learning Multi-faceted Activities from Heterogeneous Data with the Product Space Hierarchical Dirichlet Processes
Authors
Thanh-Binh Nguyen
Vu Nguyen
Svetha Venkatesh
Dinh Phung
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-42996-0_11