Published in: International Journal of Data Science and Analytics 1/2021

21 May 2020 | Regular Paper

Multi-task learning by hierarchical Dirichlet mixture model for sparse failure prediction

Written by: Simon Luo, Victor W. Chu, Zhidong Li, Yang Wang, Jianlong Zhou, Fang Chen, Raymond K. Wong

Abstract

Sparsity and noisy labels occur inherently in real-world data. Previously, domain experts relied on strong assumptions, drawing on their experience and expertise to select parameters for their models. A similar approach has been adopted in machine learning for hyper-parameter setting. However, these assumptions are often subjective and are not necessarily the optimal choice. To address this problem, we propose a data-driven approach that automates model parameter learning via a Bayesian nonparametric formulation. We propose a hierarchical Dirichlet process mixture model (HDPMM) as a multi-task learning framework. It is used to learn the parameters common to different datasets in the same industry. In our experiments, we verified the capability of HDPMM for multi-task learning in infrastructure failure prediction. This was done by combining HDPMM with a hierarchical beta process, which is our failure prediction model. In particular, multi-task learning was used to gain additional knowledge from the failure records of water supply networks managed by other utility companies, in order to improve the prediction accuracy of our model. Notably, we achieved higher accuracy for sparse predictions than previous state-of-the-art models. Moreover, we demonstrated the capability of our proposed model to support preventive maintenance of critical infrastructure.
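
The abstract does not describe how the model is constructed. As a rough illustration of how a hierarchical Dirichlet process can let several datasets (tasks) share one set of mixture components, the sketch below uses the standard truncated stick-breaking construction: global component weights are drawn once, and each task then draws its own weights centred on the global ones. The function names, the concentration values `gamma` and `alpha`, and the truncation level are assumptions chosen for illustration; this is not the authors' implementation and it omits the hierarchical beta process entirely.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights; the returned list sums to 1."""
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, concentration)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # the last stick takes the leftover mass
    return weights


def dirichlet(alphas, rng):
    """Sample a Dirichlet vector by normalising independent gamma draws."""
    # Floor the shape parameter so gammavariate never receives a value <= 0.
    draws = [rng.gammavariate(max(a, 1e-12), 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_task_weights(num_tasks, gamma=1.0, alpha=5.0, truncation=10, seed=0):
    """Draw shared global weights, then one weight vector per task.

    gamma, alpha and truncation are illustrative values (assumptions).
    """
    rng = random.Random(seed)
    # Top level: weights over mixture components shared by all tasks.
    global_weights = stick_breaking(gamma, truncation, rng)
    # Task level: under truncation, each task's weights follow
    # Dirichlet(alpha * global_weights), so tasks reuse the same components
    # but can emphasise them differently.
    task_weights = [
        dirichlet([alpha * b for b in global_weights], rng)
        for _ in range(num_tasks)
    ]
    return global_weights, task_weights


if __name__ == "__main__":
    shared, per_task = sample_task_weights(num_tasks=3)
    print("global:", [round(w, 3) for w in shared])
    for j, w in enumerate(per_task):
        print(f"task {j}:", [round(x, 3) for x in w])
```

A larger `alpha` pulls each task's weights closer to the shared global weights, while a smaller value lets tasks diverge; in a multi-task setting this is one way the amount of information borrowed across datasets could be controlled.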

Metadata
Title
Multi-task learning by hierarchical Dirichlet mixture model for sparse failure prediction
Written by
Simon Luo
Victor W. Chu
Zhidong Li
Yang Wang
Jianlong Zhou
Fang Chen
Raymond K. Wong
Publication date
21 May 2020
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 1/2021
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-020-00219-z
