Geospatial Big Data: Challenges and Opportunities☆
Introduction
Geospatial data has always been big data. Big data analytics for geospatial data is now receiving considerable attention because it lets users analyze huge amounts of geospatial data. Geospatial big data typically refers to spatial data sets that exceed the capacity of current computing systems. The McKinsey Global Institute reports that the pool of personal location data was at the level of 1 PB in 2009 and is growing at a rate of 20% per year [1]. This estimate did not include data from RFID sensors or data stored in private archives. According to an estimate by the United Nations Initiative on Global Geospatial Information Management (UN-GGIM), 2.5 quintillion bytes of data are generated every day, and a large portion of that data is location-aware. Google alone generates about 25 PB of data per day, a significant portion of which falls into the realm of spatio-temporal data [2]. This trend will accelerate further as the world becomes more and more mobile. As shown in Fig. 1, internet traffic from mobile devices in India has already exceeded that from desktop computers [3].
Along with this exponential increase in geospatial big data, high-performance computing capability is needed more than ever for modeling and simulating geospatially enabled content. However, because of limited processing power, it has been hard to fully exploit high-volume or high-velocity collections of geospatial data in many applications. Recently, distributed, parallel processing on a cluster of commodity computers or on a cloud such as Amazon EC2 has become widely available, removing the existing limitations on processing power. In addition, big data platforms such as Hadoop [4], Hive [5], and MongoDB [6] have been developed so that users can implement big data analytics software easily on a distributed, parallel computing platform. These recent improvements provide many opportunities for advanced analytics on geospatial big data [7], [8], [9]. According to Gartner's hype cycle in Fig. 2, geospatial big data analytics was at the peak of inflated expectations as of July 2012 [10].
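To make the map-reduce style of processing behind platforms such as Hadoop more concrete, the following is a minimal single-machine sketch in plain Python. It does not use the Hadoop API. The grid cell size `CELL_DEG` and all function names are assumptions introduced only for illustration. It counts location points per grid cell in three phases (map, shuffle, reduce), which a real platform would distribute across many machines.

```python
import math
from collections import defaultdict
from typing import Iterable, Iterator

CELL_DEG = 1.0  # hypothetical grid cell size, in degrees

Cell = tuple[int, int]


def map_point(point: tuple[float, float]) -> Iterator[tuple[Cell, int]]:
    """Map phase: emit (grid cell, 1) for one (lat, lon) point."""
    lat, lon = point
    cell = (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))
    yield cell, 1


def shuffle(pairs: Iterable[tuple[Cell, int]]) -> dict[Cell, list[int]]:
    """Shuffle phase: group all emitted values by their key."""
    groups: dict[Cell, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_cell(cell: Cell, counts: list[int]) -> tuple[Cell, int]:
    """Reduce phase: sum the counts emitted for one grid cell."""
    return cell, sum(counts)


def count_points_per_cell(points: Iterable[tuple[float, float]]) -> dict[Cell, int]:
    """Run map, shuffle, and reduce sequentially on one machine."""
    mapped = (pair for p in points for pair in map_point(p))
    return dict(reduce_cell(k, v) for k, v in shuffle(mapped).items())
```

Because each point is mapped independently and each cell is reduced independently, both phases can in principle run in parallel, which is the property that distributed platforms exploit.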
Geospatial big data, or simply spatial big data, presents societal opportunities [11], [12]. The Millennium Project identified 15 global challenges facing humankind, as shown in Fig. 3 [13]. Many of them can benefit from geospatial big data. Shashi Shekhar [14], a renowned computer scientist, says that seven of the challenges are related to geospatial big data, as indicated by the boxes in the figure. In the energy domain, for example, eco-routing can save energy using geospatial big data. This technology minimizes fuel consumption rather than travel time or travel distance. To do so, eco-routing tries to find a route that avoids congestion, idling at red lights, turns, elevation changes, and so on. Ford researchers reported that, compared with the “Fastest Route” option, the “Eco Route” option reduced fuel consumption by as much as 15% in some of their vehicles [15]. The eco-route option is now available in many Ford cars, as shown in Fig. 4.
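The difference between the fastest route and the eco route can be illustrated with a small shortest-path sketch. This is not Ford's method, which the source does not describe. It is a plain Dijkstra search run twice on the same graph, once with travel time as the edge cost and once with fuel use. The road graph `ROADS`, its numbers, and the function name `best_route` are hypothetical.

```python
import heapq

# Hypothetical road graph: node -> list of (neighbor, minutes, fuel_litres).
# The values are made up purely to illustrate the idea.
ROADS = {
    "A": [("B", 5.0, 0.9), ("C", 7.0, 0.5)],
    "B": [("D", 5.0, 0.8)],
    "C": [("D", 6.0, 0.4)],
    "D": [],
}


def best_route(graph, source, target, cost_index):
    """Dijkstra's algorithm; cost_index 0 minimizes minutes, 1 minimizes fuel."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, *costs in graph[node]:
            if neighbor not in settled:
                heapq.heappush(
                    queue, (cost + costs[cost_index], neighbor, path + [neighbor])
                )
    return float("inf"), []  # target not reachable


fastest_cost, fastest_path = best_route(ROADS, "A", "D", cost_index=0)
eco_cost, eco_path = best_route(ROADS, "A", "D", cost_index=1)
```

In this toy graph the fastest route goes through B, while the eco route goes through C because that path burns less fuel even though it takes longer. A practical eco-routing system would need per-edge fuel estimates derived from traffic, signal, and elevation data.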
The McKinsey Global Institute conducted a study on how big data can transform our world [16]. As for geospatial big data, the study says “the use of personal location data could save consumers worldwide more than $600 billion annually by 2020.” Users' current locations can be found by tracking their mobile devices, such as smartphones. The study mentioned geosocial networking services such as Foursquare, which are used for locating friends and for finding nearby stores and restaurants, and where many users check in at various places and reveal their current location [17]. According to the study, however, the biggest consumer benefit will come from the time and fuel saved thanks to location-based services that take real-time traffic and weather data into account to help drivers avoid traffic congestion and to recommend alternative routes. Location tracking can be done using a driver's smartphone or a global positioning system (GPS) device installed in a car.
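Finding nearby stores and restaurants from a user's current location can be sketched as a simple distance filter. The example below assumes places are given as (name, latitude, longitude) tuples and uses the haversine great-circle formula with a mean Earth radius of 6371 km. The function names and the linear scan are illustrative choices, not how any particular service works.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(user, places, radius_km):
    """Return names of places within radius_km of the user, nearest first."""
    hits = [
        (haversine_km(user[0], user[1], lat, lon), name)
        for name, lat, lon in places
    ]
    return [name for dist, name in sorted(hits) if dist <= radius_km]
```

A linear scan like this is adequate for a handful of places. At the data volumes discussed here, a spatial index would normally replace it.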
Power of location
Sir Martin Sorrell [18], the CEO of WPP Group, says “Location targeting is holy grail for marketers.” Big data analytics is an effective way to enhance the power of location [19]. For example, Netflix's video rental service can benefit from analyzing rental patterns by region, with regions designated by zip codes [20]. For the study whose result is shown in Fig. 5, Netflix generated data on the top-50 rentals of 2009 in each zip code. Titles were listed in the approximate order of popularity across
Data collection
Geospatial big data comes in several forms. Traditionally, geospatial data is categorized into three forms: raster data, vector data, and graph data [14]. First, raster data include geo-images typically obtained by unmanned aerial vehicles, security cameras, and satellites. The military has recently been collecting huge amounts of raster data using drones, and satellites keep providing remote sensing data of the Earth. Table 1 shows some of the climate and earth
On-going efforts
We recently initiated a new research project on spatial big data, supported by the Ministry of Land, Infrastructure, and Transport of the Korean Government. This project is planned for five years, and the outcomes will be integrated into the public services for Korean citizens. Fig. 9 shows the entire system architecture we are planning. The system consists of three layers: geospatial big data integration & management, geospatial big data analytics, and geospatial big data service platform. The
Conclusion
In this paper, we have discussed the challenges and opportunities that geospatial big data brings. Much evidence shows that a significant portion of big data is, in fact, geospatial big data. We can innovate in our daily lives and businesses by exploiting the power of location embedded in geospatial big data. A few cases were introduced to show the real benefits of geospatial big data. Collecting geospatial big data has become fairly easy thanks to the advancements of sensor and
Acknowledgements
This research, “Geospatial Big Data Management, Analysis and Service Platform Technology Development,” was supported by the MOLIT (The Ministry of Land, Infrastructure and Transport), Korea, under the national spatial information research program supervised by the KAIA (Korea Agency for Infrastructure Technology Advancement) (14NSIP-B091011-01).
References (36)
[1] Big data: the future is in analytics
[2] et al., Spatiotemporal data mining in the era of big spatial data: algorithms and applications
[3] 2012 KPCB internet trends year-end update
[4] Hadoop: The Definitive Guide (2012)
[5] et al., Hive: a warehousing solution over a map-reduce framework, Proc. VLDB Endow. (2009)
[6] et al., The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing (2010)
[7] et al., Trajectory clustering: a partition-and-group framework
[8] et al., Trajectory outlier detection: a partition-and-detect framework
[9] et al., TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering, Proc. VLDB Endow. (2008)
[10] Hype cycle for big data, 2012
[11] Reality Mining: Using Big Data to Engineer a Better World
[12] Big Data: A Revolution That Will Transform How We Live, Work, and Think
[13] Global challenges for humanity
[14] Spatial big data challenges
[15] GPS systems that save gas
[16] New ways to exploit raw data may bring surge of innovation, a study says
[17] Glaucus: exploiting the wisdom of crowds for location-based queries in mobile environments
[18] The power of apps
☆ This article belongs to Visions on Big Data.