ABSTRACT
Machine learning methods and the availability of spatial data have both advanced remarkably in recent times. This parallel growth has encouraged wider application of traditional data mining techniques for knowledge discovery on spatial data. However, these techniques assume that the data is independent and identically distributed, whereas spatial data is inherently dependent and heterogeneous. This contradiction strongly suggests that a naive application of conventional data mining techniques to spatial data would be suboptimal. In this paper, we evaluate the relatedness of street networks using a transfer learning methodology within the formal context of spatial data. Adopting a statistical multi-measure, we analyze street networks from eight cities in an attempt to ascertain their similarities. We predict the street types using random forests and evaluate the accuracies as a function of transfer polarity: transfer is positive when the transferred model performs better than the parent model and negative when it performs worse. With an overall average accuracy of 85%, our results show that it is possible to generalize machine learning models onto different domains and still produce excellent results. We also demonstrate that the gain or loss in model accuracy can be explained by the degree of statistical similarity between the domains. This observation confirms that a measure of inter-domain similarity based solely on geo-political boundaries would be erroneous. The techniques we have described provide a statistically sound foundation for analyzing similarities in the spatial context. They can be adopted towards understanding the extent of model generalization for spatial networks.
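The notion of transfer polarity used in the abstract can be illustrated with a minimal sketch. It assumes that the "parent" accuracy is the accuracy of a model evaluated on its own (source) domain and that the "transferred" accuracy is the accuracy of the same model applied to a different (target) domain. The function name `transfer_polarity`, the `tolerance` parameter, and the "neutral" outcome are hypothetical additions for illustration and are not taken from the paper.

```python
def transfer_polarity(parent_accuracy, transferred_accuracy, tolerance=0.0):
    """Classify a model transfer as positive, negative, or neutral.

    parent_accuracy: accuracy of the model on its own (source) domain.
    transferred_accuracy: accuracy of the same model on the target domain.
    tolerance: hypothetical margin inside which the two accuracies are
        treated as equal (an assumption, not part of the paper).
    """
    delta = transferred_accuracy - parent_accuracy
    if delta > tolerance:
        return "positive"   # transferred model performs better than the parent
    if delta < -tolerance:
        return "negative"   # transferred model performs worse than the parent
    return "neutral"        # assumed label for differences within the tolerance


if __name__ == "__main__":
    # Placeholder accuracies chosen only to exercise each branch;
    # they are not results reported in the paper.
    print(transfer_polarity(0.80, 0.85))
    print(transfer_polarity(0.80, 0.75))
    print(transfer_polarity(0.80, 0.80))
```

A nonzero `tolerance` would be one way to avoid labelling small accuracy fluctuations as a genuine gain or loss. Whether such a margin is appropriate depends on the evaluation setup.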
Index Terms
- A transfer learning paradigm for spatial networks