2019 | OriginalPaper | Book chapter

Consistency of the Fittest: Towards Dynamic Staleness Control for Edge Data Analytics

Written by: Atakan Aral, Ivona Brandic

Published in: Euro-Par 2018: Parallel Processing Workshops

Publisher: Springer International Publishing

Abstract

A critical challenge for data stream processing at the edge of the network is the consistency of the machine learning models across distributed worker nodes. Especially in the case of non-stationary streams, which exhibit a high degree of data set shift, mismanagement of models risks suboptimal accuracy due to staleness and ignored data. In this work, we analyze the model consistency challenges of the distributed online machine learning scenario and present preliminary solutions for synchronizing model updates. Additionally, we propose metrics for measuring the level and speed of data set shift.

Metadata
Title
Consistency of the Fittest: Towards Dynamic Staleness Control for Edge Data Analytics
Written by
Atakan Aral
Ivona Brandic
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-10549-5_4
