
2015 | OriginalPaper | Book chapter

Enabling Vehicular Data with Distributed Machine Learning

Written by: Cristian Chilipirea, Andreea Petre, Ciprian Dobre, Florin Pop, Fatos Xhafa

Published in: Transactions on Computational Collective Intelligence XIX

Publisher: Springer Berlin Heidelberg

Abstract

Vehicular Data comprises facts and measurements collected from a set of moving vehicles. Most of us use cars or public transportation for our work commute, daily routines, and leisure. But apart from our destination, our likely time of arrival, and what is directly around us, we know very little about traffic conditions in the city as a whole. Because all roads are connected in a vast network, events in other parts of town can and will affect us directly. The more we know about the traffic inside a city, the better the decisions we can make. Vehicular measurements may contain a vast amount of information about the way our cities function. This information can be used for more than improving our commute: it is also indicative of other features of the city, such as the amount of pollution in different regions. All the information and knowledge we can extract can be used to directly improve our lives.
We live in a world where data is constantly generated, and we store and process it at an ever-growing rate. Vehicular Data is no exception: it is rapidly growing in size and complexity, with more and more ways to monitor traffic, either from inside cars or from sensors placed on the road. Smartphones and in-car computers are now common, and they can produce a vast amount of data: they can identify a car's location, destination, current speed, and even driving habits.
Machine learning is a natural complement to Big Data, as large data sets are of little use without methods to extract knowledge and information from them. Machine learning, currently a popular research topic, offers a large number of algorithms designed for this task of knowledge extraction. Most of these techniques and algorithms can be applied directly to Vehicular Data.
In this article we demonstrate how a simple algorithm, k-Nearest Neighbors, can be used to extract valuable information from even a relatively small vehicular data set. Because of the vast size of most cities and the number of cars on their roads at any time of day, standard machine learning systems cannot process the data quickly enough to permit real-time use of the extracted information. Distributed systems and cloud processing offer a solution to this problem. By parallelizing and distributing machine learning algorithms, we can use the data to its full potential and with little delay. Here, we show how this can be achieved by distributing the k-Nearest Neighbors machine learning algorithm over MPI. We hope this will motivate research into other combinations of machine learning algorithms and Vehicular Data sets.
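
The abstract does not describe the chapter's implementation, so the following is only a minimal sketch of one common way to distribute k-Nearest Neighbors: split the data across workers, let each worker find its own k nearest candidates, then merge the candidates at a root. It is not the authors' code. To stay self-contained it simulates the workers in a single process rather than calling MPI, and the function names, the two-feature points, and the "slow"/"fast" labels are all hypothetical.

```python
import heapq
import math


def local_top_k(partition, query, k):
    """Return the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(point, query), label) for point, label in partition)
    return heapq.nsmallest(k, scored)


def merge_top_k(candidate_lists, k):
    """Combine per-partition candidates into the global k nearest."""
    return heapq.nsmallest(k, (c for lst in candidate_lists for c in lst))


def classify(candidates):
    """Majority vote over neighbor labels; ties go to the label seen first."""
    counts = {}
    for _, label in candidates:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)


def distributed_knn(data, query, k, n_workers):
    # Assumption: a round-robin split stands in for scattering the data
    # across MPI ranks; no real message passing happens here.
    partitions = [data[i::n_workers] for i in range(n_workers)]
    # Each worker would run this step independently on its own partition.
    local = [local_top_k(p, query, k) for p in partitions]
    # The root would gather the candidate lists and reduce them to k.
    return classify(merge_top_k(local, k))


# Hypothetical samples: (feature vector, label), invented for illustration.
samples = [
    ((0.0, 0.0), "slow"),
    ((0.1, 0.2), "slow"),
    ((5.0, 5.0), "fast"),
    ((5.1, 4.9), "fast"),
    ((0.2, 0.1), "slow"),
]
result = distributed_knn(samples, (0.0, 0.1), k=3, n_workers=2)
```

Because every global nearest neighbor is also among the k nearest of its own partition, merging the per-worker candidates gives the same neighbors as a single-machine search, while each worker only scans its share of the data.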

Metadata
Title
Enabling Vehicular Data with Distributed Machine Learning
Authors
Cristian Chilipirea
Andreea Petre
Ciprian Dobre
Florin Pop
Fatos Xhafa
Copyright year
2015
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-49017-4_6