DOI: 10.1145/3297280.3297342
research-article

A transfer learning paradigm for spatial networks

Published: 08 April 2019

ABSTRACT

Machine learning methods and the availability of spatial data have both advanced remarkably in recent times. This parallel growth has increased the application of traditional data mining techniques to knowledge discovery on spatial data. However, these techniques assume that the data is independent and identically distributed, whereas spatial data is inherently dependent and heterogeneous. This contradiction strongly suggests that a naive application of conventional data mining techniques to spatial data would be suboptimal. In this paper, we evaluate the relatedness of street networks using a transfer learning methodology within the formal contexts of spatial data. Adopting a statistical multi-measure, we analyze street networks from eight cities in an attempt to ascertain their similarities. We predict the street types using random forests and evaluate the accuracies as a function of transfer polarity: transfer is positive when the transferred model performs better than the parent model and negative when it performs worse. With an overall average accuracy of 85%, our results show that it is possible to generalize machine learning models onto different domains and still produce excellent results. We also demonstrate that the gain or loss in model accuracy can be explained by the degree of statistical similarity between the domains. This observation confirms that a measure of inter-domain similarity based solely on geo-political boundaries would be erroneous. The techniques we describe provide a statistically sound foundation for analyzing similarities in the spatial context, and they can be adopted to understand the extent of model generalization for spatial networks.
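The notion of transfer polarity in the abstract can be illustrated with a minimal sketch. It assumes polarity is decided by comparing the accuracy of a model applied to a target city against the accuracy of the parent model on its own city. The function names, the `tol` tolerance band, and the street-type labels below are hypothetical and chosen only for illustration; they are not the authors' code or data, and no random forest is trained here.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predicted street types that match the true street types."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def transfer_polarity(parent_acc: float, transferred_acc: float, tol: float = 0.0) -> str:
    """Label a transfer as positive, negative, or neutral.

    `tol` is a hypothetical tolerance band (an assumption of this sketch);
    with tol=0.0 any improvement is positive and any loss is negative.
    """
    delta = transferred_acc - parent_acc
    if delta > tol:
        return "positive"
    if delta < -tol:
        return "negative"
    return "neutral"


# Hypothetical labels: parent model evaluated on its own (source) city.
source_true = ["residential", "primary", "service", "residential"]
source_pred = ["residential", "primary", "residential", "residential"]

# Hypothetical labels: the same model transferred to a target city.
target_true = ["residential", "service", "primary", "residential"]
target_pred = ["residential", "service", "primary", "residential"]

parent_acc = accuracy(source_true, source_pred)
transferred_acc = accuracy(target_true, target_pred)
polarity = transfer_polarity(parent_acc, transferred_acc)
print(parent_acc, transferred_acc, polarity)
```

In practice the predictions would come from a trained classifier such as a random forest; the comparison step itself stays the same.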


Published in

SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
April 2019
2682 pages
ISBN: 9781450359337
DOI: 10.1145/3297280

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,650 of 6,669 submissions (25%)
