Published in: Data Mining and Knowledge Discovery 6/2018

18-05-2018

A probabilistic stop and move classifier for noisy GPS trajectories

Authors: Luke Bermingham, Ickjai Lee


Abstract

Stop and move information can be used to uncover useful semantic patterns; therefore, annotating GPS trajectories as either stopping or moving is beneficial. However, automatically determining whether an entity is stopping or moving is challenging because real-world GPS trajectories are spatially noisy. Existing approaches classify each entry definitively as either a stop or a move, hiding any indication that some classifications can be made with more certainty than others. Such an indication of the “goodness of classification” of each entry would allow the user to filter out stop classifications that appear too ambiguous for their use case, which in a data-mining context may ultimately lead to fewer false patterns. In this work we propose such an approach: it takes a noisy GPS trajectory as input and calculates the stop probability at each entry. Through a minimum stop probability parameter, our approach allows the user to directly filter out any classified stops whose probability is unacceptable for their application. Using several real-world and synthetic GPS trajectories (which we have made available), we compared the classification effectiveness, parameter sensitivity, and running time of our approach to two well-known existing approaches, SMoT and CB-SMoT. Experimental results indicated the efficiency, effectiveness, and sampling-rate robustness of our approach compared to the existing approaches. The results also demonstrated that the user can increase the minimum stop probability parameter to easily filter out low-probability stop classifications, which effectively reduced the number of false positive classifications in our ground truth experiments. Lastly, we proposed estimation heuristics for each of our approach’s parameters and empirically demonstrated the effectiveness of each heuristic using real-world trajectories. Specifically, the results revealed that even when all of the parameters were estimated, the classification effectiveness of our approach was higher than that of existing approaches across a range of sampling rates.
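
To illustrate the filtering role of the minimum stop probability parameter described in the abstract, the following is a minimal sketch. It assumes the per-entry stop probabilities have already been computed by some classifier; the function name `filter_stops`, the argument names, and the use of an inclusive `>=` comparison are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List


def filter_stops(stop_probabilities: List[float],
                 min_stop_probability: float) -> List[bool]:
    """Label an entry as a stop only if its stop probability reaches the threshold.

    Assumption: the threshold comparison is inclusive (>=). The paper's
    exact rule is not given in the abstract.
    """
    return [p >= min_stop_probability for p in stop_probabilities]


# Hypothetical probabilities for four trajectory entries.
probabilities = [0.95, 0.55, 0.10, 0.80]

lenient = filter_stops(probabilities, 0.5)   # keeps ambiguous stops
strict = filter_stops(probabilities, 0.9)    # keeps only confident stops
```

Raising the threshold from 0.5 to 0.9 in this sketch drops the two less certain stop labels, which mirrors the trade-off the abstract describes: fewer false positives at the cost of discarding ambiguous stops.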


Footnotes
1
In the case where stops must last for some minimum amount of time, it is straightforward to enforce this constraint on POSMIT’s stop/move classification result. First, all contiguous entries classified as stops are merged into groups; next, the combined duration of each group is calculated; finally, groups whose durations are too short are reclassified as moves.
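
The post-processing step in this footnote can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `enforce_min_stop_duration` is hypothetical, timestamps are assumed to be numeric and sorted, and a group's duration is assumed to be the time between its first and last entry.

```python
from itertools import groupby
from typing import List, Sequence


def enforce_min_stop_duration(timestamps: Sequence[float],
                              is_stop: Sequence[bool],
                              min_duration: float) -> List[bool]:
    """Reclassify contiguous stop groups shorter than min_duration as moves.

    Assumption: a group's duration is the last timestamp in the group
    minus the first timestamp in the group.
    """
    result = list(is_stop)
    index = 0
    # groupby merges contiguous entries that share the same label.
    for label, run in groupby(is_stop):
        length = len(list(run))
        if label:
            start, end = index, index + length - 1
            duration = timestamps[end] - timestamps[start]
            if duration < min_duration:
                # Group is too short to count as a stop: turn it into moves.
                result[start:end + 1] = [False] * length
        index += length
    return result


# Hypothetical trajectory sampled every 10 seconds.
times = [0, 10, 20, 30, 40, 50, 60, 70]
labels = [True, True, False, True, True, True, True, False]
cleaned = enforce_min_stop_duration(times, labels, min_duration=25)
# cleaned == [False, False, False, True, True, True, True, False]
```

In this example the first stop group lasts 10 seconds and is reclassified as moves, while the second lasts 30 seconds and is kept.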
 
2
Entries with spatial coordinates in a non-Cartesian geographic coordinate system will need to be converted to Cartesian coordinates before a suitable Euclidean distance can be calculated. Euclidean distance was chosen over great-circle distance for this problem for three reasons: it is the most widely used distance in spatial analysis (Smith et al. 2015), it is faster to compute, and the distances between points in a candidate stop are intrinsically small, so the effect of Earth’s curvature is negligible.
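
The claim that curvature is negligible over stop-sized distances can be checked numerically. The sketch below compares the great-circle (haversine) distance with a Euclidean distance computed after a simple local equirectangular projection. The projection choice, the function names, the spherical Earth radius, and the sample coordinates are all assumptions made for illustration; the paper does not state which projection it uses.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project to a local Cartesian plane (equirectangular) around a reference point."""
    x = EARTH_RADIUS_M * math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat))
    y = EARTH_RADIUS_M * math.radians(lat - ref_lat)
    return x, y


def euclidean_m(p, q):
    """Straight-line distance in metres between two projected points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Two hypothetical GPS fixes a few tens of metres apart.
a = (-19.3290, 146.7570)
b = (-19.3293, 146.7574)

great_circle = haversine_m(a[0], a[1], b[0], b[1])
planar = euclidean_m(to_local_xy(a[0], a[1], a[0], a[1]),
                     to_local_xy(b[0], b[1], a[0], a[1]))
difference = abs(great_circle - planar)
```

For points this close together the two distances agree to well under a centimetre, far below typical GPS noise, which is consistent with the footnote's reasoning.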
 
Literature
Alvares LO, Bogorny V, Kuijpers B, de Macedo JAF, Moelans B, Vaisman A (2007) A model for enriching trajectories with semantic geographical information. In: Proceedings of the 15th annual ACM international symposium on advances in geographic information systems GIS ’07. ACM, New York, pp 22:1–22:8
Boukhechba M, Bouzouane A, Bouchard B, Gouin-Vallerand C, Giroux S (2015) Online recognition of people’s activities from raw GPS data: semantic trajectory data analysis. In: Proceedings of the 8th ACM international conference on PErvasive technologies related to assistive environments PETRA ’15. ACM, New York, pp 40:1–40:8. https://doi.org/10.1145/2769493.2769498
de Smith MJ, Goodchild MF, Longley PA (2015) Geospatial analysis: a comprehensive guide to principles, techniques and software tools, 5th edn. The Winchelsea Press
Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD ’96: proceedings of the 2nd international conference on knowledge discovery and data mining. AAAI Press, pp 226–231
Fischer MM, Getis A (eds) (2010) Handbook of applied spatial analysis: software tools, methods and applications. Springer
Gonzalez MC, Hidalgo CA, Barabasi AL (2008) Understanding individual human mobility patterns. Nature 453(7196):779–782
Guidotti R, Trasarti R, Nanni M (2015) TOSCA: two-steps clustering algorithm for personal locations detection. In: Proceedings of the 23rd SIGSPATIAL international conference on advances in geographic information systems SIGSPATIAL ’15. ACM, New York, pp 38:1–38:10
Guidotti R, Trasarti R, Nanni M, Giannotti F, Pedreschi D (2017) There’s a path for everyone: a data-driven personal model reproducing mobility agendas. In: 2017 IEEE international conference on data science and advanced analytics (DSAA), pp 303–312
Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York
Huang L, Li Q, Yue Y (2010) Activity identification from GPS trajectories using spatial temporal POIs’ attractiveness. In: Proceedings of the 2nd ACM SIGSPATIAL international workshop on location based social networks LBSN ’10. ACM, New York, pp 27–30. https://doi.org/10.1145/1867699.1867704
Hwang YC, Lin CC, Chang JR, Mori H, Huang HC (2009) Predicting essential genes based on network and sequence analysis. Mol BioSyst 5:1672–1678
Khetarpaul S, Chauhan R, Gupta SK, Subramaniam LV, Nambiar U (2011) Mining GPS data to determine interesting locations. In: Proceedings of the 8th international workshop on information integration on the Web: in conjunction with WWW 2011 IIWeb ’11. ACM, New York, pp 8:1–8:6. https://doi.org/10.1145/1982624.1982632
Leung KWT, Lee DL, Lee WC (2011) CLR: a collaborative location recommendation framework based on co-clustering. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval SIGIR ’11. ACM, New York, pp 305–314. https://doi.org/10.1145/2009916.2009960
McCarroll D (2017) Simple statistical tests for geography. CRC Press, Boca Raton
Powers D (2011) Evaluation: from precision, recall and f-measure to ROC, informedness, markedness & correlation. J Mach Learn Technol 2:37–63
Satopaa V, Albrecht J, Irwin D, Raghavan B (2011) Finding a “kneedle” in a haystack: detecting knee points in system behavior. In: Proceedings of the 2011 31st international conference on distributed computing systems workshops ICDCSW ’11. IEEE Computer Society, Washington, pp 166–171. https://doi.org/10.1109/ICDCSW.2011.20
Smith MJ, Goodchild MF, Longley PA (2015) Geospatial analysis: a comprehensive guide to principles, techniques and software tools, 5th edn. The Winchelsea Press, Leicester
Spinsanti L, Celli F, Renso C (2010) Where you stop is who you are: understanding peoples activities by places visited. In: BMI ’10: Proceedings of the 5th BMI workshop on behaviour monitoring and interpretation. CEUR-WS, Karlsruhe, Germany, pp 38–52
Thierry B, Chaix B, Kestens Y (2013) Detecting activity locations from raw GPS data: a novel kernel-based algorithm. Int J Health Geogr 12(1):14
Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240
Tran LH, Nguyen QVH, Do NH, Yan Z (2011) Robust and hierarchical stop discovery in sparse and diverse trajectories. Technical report, EPFL
Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y (2010) T-drive: driving directions based on taxi trajectories. In: Proceedings of the 18th SIGSPATIAL international conference on advances in geographic information systems GIS ’10. ACM, New York, pp 99–108. https://doi.org/10.1145/1869790.1869807
Metadata
Title
A probabilistic stop and move classifier for noisy GPS trajectories
Authors
Luke Bermingham
Ickjai Lee
Publication date
18-05-2018
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 6/2018
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-018-0568-8
