Published in: The Journal of Supercomputing 7/2016

01-07-2016

Highway traffic accident prediction using VDS big data analysis

Authors: Seong-hun Park, Sung-min Kim, Young-guk Ha

Abstract

In modern society, road accidents are among the most life-threatening dangers to humans, and traffic accidents that cause heavy damage occur everywhere. One of the most effective countermeasures is to predict accidents in advance, giving drivers a chance to avoid the danger or to reduce the damage by responding quickly. Accidents on the road can be predicted using classification analysis, a data mining procedure that requires enough data to build a learning model. However, building such a prediction system involves several problems. Because traffic data are extremely large, collecting and analyzing them to predict accidents requires substantial hardware resources. In addition, the amount of data related to traffic accidents is smaller than the amount not related to them, so the two kinds of classes (the classes to be predicted and the other classes) are imbalanced. The purpose of this paper is to build a prediction model that resolves all of these problems. This paper proposes using the Hadoop framework to process and analyze big traffic data efficiently, and a sampling method to resolve the data imbalance. On this basis, the prediction system first preprocesses and analyzes the big traffic data to create data for the learning system. The imbalance in the created data is then corrected using a sampling method. To improve prediction accuracy, the corrected data are classified into several groups, and classification analysis is applied to them.
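As a rough illustration of what correcting class imbalance by sampling can look like, the sketch below applies random undersampling to a toy data set. This is only an assumption for illustration: the abstract does not say which sampling method the authors use, and the record layout (speed, volume, label), the function name undersample, and the class names "normal" and "accident" are all hypothetical.

```python
import random
from collections import Counter


def undersample(records, label_of, seed=0):
    """Randomly drop records so every class has as many as the smallest class.

    Illustrative only: random undersampling is one common way to balance
    classes, not necessarily the method used in the paper.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = {}
    for rec in records:
        by_class.setdefault(label_of(rec), []).append(rec)

    # The smallest class sets the size every class is reduced to.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy records: (speed_kmh, volume, label).
records = [(100 - i % 30, 20 + i % 15, "normal") for i in range(95)]
records += [(40 + i, 60 + i, "accident") for i in range(5)]

balanced = undersample(records, label_of=lambda r: r[2])
counts = Counter(r[2] for r in balanced)
print(counts)
```

Undersampling discards majority-class records, which keeps the training set small; oversampling the minority class is the usual alternative when too few accident records remain to learn from.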

Metadata
Title: Highway traffic accident prediction using VDS big data analysis
Authors: Seong-hun Park, Sung-min Kim, Young-guk Ha
Publication date: 01-07-2016
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 7/2016
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-016-1624-z
