
2014 | Book

Mobility Data Management and Exploration

Authors: Nikos Pelekis, Yannis Theodoridis

Publisher: Springer New York


About this book

This text integrates different mobility data handling processes, from database management to multi-dimensional analysis and mining, into a unified presentation driven by the spectrum of requirements raised by real-world applications. It presents a step-by-step methodology to understand and exploit mobility data: collecting and cleansing data, storage in Moving Object Database (MOD) engines, indexing, processing, analyzing and mining mobility data. Emerging issues, such as semantic and privacy-aware querying and mining as well as distributed data processing, are also covered. Theoretical presentation alternates smoothly with hands-on exercises and case studies involving an actual MOD engine. The authors are established experts who address both theoretical and practical dimensions of the field and also present valuable prototype software. The background context, clear explanations and sample exercises make this an ideal textbook for graduate students studying database management, data mining and geographic information systems.

Table of contents

Frontmatter

Setting the Scene

Frontmatter
Chapter 1. Introduction
Abstract
Space and time: the two axes along which our lives evolve. Every physical object has its own location (in space), a location that may change as time passes. This is how mobility is formed and governs our lives. Think of what ‘frozen’ time would mean; but this turns into a philosophical discussion, which is certainly beyond the scope of this book… The database industry has for years been able to efficiently support time and space, though independently (the so-called Spatial and Temporal Databases, SDB and TDB, respectively). However, it is obvious that these two axes of information find many interesting applications if handled in conjunction. Applications (e.g. cadastral systems) that consider spatial objects which may change their shape or location discretely, from time to time, are usually called Spatio-Temporal Databases (STDB), whereas those that consider continuous or at least very frequent changes of objects’ locations are classified under the term Moving Object Databases (MOD). In the latter case, the main content of the database is the so-called mobility data, i.e. information about the movement of objects, which includes, at least, location and time information. In this chapter, we preview the concept of mobility data and briefly discuss what we can learn from such data collections. We summarize by discussing the transition from (stationary) spatial to mobility data management and the challenges that emerge.
Nikos Pelekis, Yannis Theodoridis
Chapter 2. Background on Spatial Data Management and Exploration
Abstract
Before studying mobility data, we have to take a short tour of the (stationary) spatial domain. For decades, spatial information has been studied thoroughly, from Cartography and Geodesy to Geographical Information Systems (GIS) and Spatial Database Management Systems (SDBMS); this is justified by its importance and ubiquity in our everyday lives. The database community has followed the paradigm of extended DBMS and provided inherent spatial functionality in geographical data collections by developing spatial data types, operators and methods for querying, as well as indexing techniques. At the exploration level, multi-dimensional online analytical processing (OLAP) and knowledge discovery in databases (KDD) have produced excellent results in the spatial domain. In this chapter, we review spatial database management (modeling, indexing, query processing) and exploration aspects (data warehousing and OLAP analysis, data mining), followed by a short discussion on data privacy aspects. This is essential knowledge for the reader to become familiar with background terms and notions used in the corresponding discussion of the mobility data domain in the chapters that follow.
Nikos Pelekis, Yannis Theodoridis

Mobility Data Management

Frontmatter
Chapter 3. Modeling and Acquiring Mobility Data
Abstract
The vast spread of GPS-equipped mobile devices, such as smart phones, GPS navigation devices etc., combined with the development of appropriate techniques for storing, processing, querying, and mining this kind of data, has resulted in the production of huge amounts of location-aware information. However, getting this kind of raw data into a meaningful form in terms of mobility is not a straightforward task at all. During this transformation procedure, many aspects arise, such as the treatment of inaccurate or noisy information, the identification of trajectories as sequences of sampled positions, the reduction of the size of the datasets in order to deal with the storage challenges that may appear, etc. Furthermore, in order to evaluate the performance of spatiotemporal algorithms and data structures, enormous amounts of real-world mobility data are required, which are not readily available. To bridge this gap, many generators of moving object trajectories have been developed. This chapter provides a review of the above-mentioned research field and is organized as follows. After a necessary discussion on mobility data modeling (the concepts of time-stamped locations, trajectories, etc.), we refresh our discussion on collecting raw mobility data through GPS devices and handle aspects that arise, such as noise and inaccuracy. Then, we introduce the trajectory reconstruction problem and the most popular techniques dealing with this issue (trajectory identification, map-matching, compression). We conclude by familiarizing the reader with the notion of mobility data generators along with a discussion on several developments in that field, for movement either in free space or under network constraints.
Nikos Pelekis, Yannis Theodoridis
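The trajectory identification step mentioned in the Chapter 3 abstract can be illustrated with a minimal sketch. It assumes raw positions are (x, y, t) tuples for a single moving object and that a new trajectory starts whenever the time gap between consecutive samples exceeds a threshold. The function name `split_into_trajectories` and the parameter `max_gap` are hypothetical and are not taken from the book; real techniques may also use spatial gaps, speed limits, or map information.

```python
from typing import List, Tuple

# A time-stamped location: (x, y, t). Units are left to the caller (assumption).
Point = Tuple[float, float, float]


def split_into_trajectories(points: List[Point], max_gap: float) -> List[List[Point]]:
    """Split raw positions of one object into trajectories.

    A new trajectory starts whenever the temporal gap between two
    consecutive samples is larger than max_gap (a hypothetical threshold).
    """
    ordered = sorted(points, key=lambda p: p[2])  # order samples by timestamp
    trajectories: List[List[Point]] = []
    current: List[Point] = []
    for p in ordered:
        if current and p[2] - current[-1][2] > max_gap:
            trajectories.append(current)  # close the current trajectory
            current = []
        current.append(p)
    if current:
        trajectories.append(current)
    return trajectories


if __name__ == "__main__":
    raw = [(0, 0, 0), (1, 0, 10), (2, 0, 20), (5, 5, 500), (6, 5, 510)]
    for traj in split_into_trajectories(raw, max_gap=60):
        print(traj)
```

With a 60-unit threshold, the sample input splits into one trajectory of three positions and one of two, because of the long pause between t = 20 and t = 500.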
Chapter 4. Mobility Database Management
Abstract
Adding temporal information, as an extra attribute in spatial databases, is not as straightforward as it may appear at first glance. Time is not just another dimension besides the two (or three, in some applications) spatial dimensions; monotonicity, for example, is a key difference. Could we adopt “as-is” methods and techniques for spatial databases, such as the ones outlined in Chap. 2? The answer is rather no, and this has been argued extensively in the spatiotemporal database literature. Therefore, novel data types (e.g., moving points), query processing techniques (e.g., “search for trajectories that ‘entered’ an area during a timeframe” or “search for trajectories that are ‘similar’ with respect to a reference trajectory”) and indexing methods (most probably, extensions of the well-known R-tree) have been explored. This chapter surveys the above aspects, which are essential components of a database system aimed at efficiently handling mobility data. In particular, interesting location- and mobility-aware queries are overviewed. Then, at the physical level, selected indexing and query processing techniques that have been designed to efficiently support the above models and query types are presented.
Nikos Pelekis, Yannis Theodoridis
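The example query from the Chapter 4 abstract, “search for trajectories that ‘entered’ an area during a timeframe”, can be sketched as a linear scan. This is only an illustration under stated assumptions: trajectories are lists of (x, y, t) samples, the area is an axis-aligned rectangle, and “entering” is judged on sampled positions only (no interpolation between samples). The names `entered` and `range_query` are hypothetical; a real MOD engine would answer such a query through an index such as an R-tree extension rather than by scanning.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]          # (x, y, t)
Rect = Tuple[float, float, float, float]    # (xmin, ymin, xmax, ymax)


def entered(traj: List[Point], rect: Rect, t_start: float, t_end: float) -> bool:
    """True if some sample lies inside rect, its predecessor lies outside,
    and the inside sample is time-stamped within [t_start, t_end]."""
    xmin, ymin, xmax, ymax = rect

    def inside(p: Point) -> bool:
        return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

    for prev, cur in zip(traj, traj[1:]):
        if not inside(prev) and inside(cur) and t_start <= cur[2] <= t_end:
            return True
    return False


def range_query(trajectories: Dict[str, List[Point]], rect: Rect,
                t_start: float, t_end: float) -> List[str]:
    """Linear scan over all trajectories (no index; illustration only)."""
    return [tid for tid, traj in trajectories.items()
            if entered(traj, rect, t_start, t_end)]


if __name__ == "__main__":
    data = {
        "a": [(0, 0, 0), (5, 5, 10), (9, 9, 20)],   # enters at t = 10
        "b": [(5, 5, 0), (5, 6, 10)],               # starts inside, never enters
        "c": [(0, 0, 0), (5, 5, 100)],              # enters outside the timeframe
    }
    print(range_query(data, (4, 4, 6, 6), 5, 50))
```

Note the design choice: a trajectory that is already inside the area at its first sample is not reported, since it is never observed crossing the boundary.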
Chapter 5. Moving Object Database Engines
Abstract
Spatial database research has focused on supporting the modeling and querying of geometries associated with objects in a database. Regarding static spatial data, the major commercial as well as open source database management systems (e.g. DB2, MySQL, Oracle, PostgreSQL, SQL Server) already provide appropriate data management and querying mechanisms that conform to Open Geospatial Consortium (OGC) standards. On the other hand, temporal databases have focused on extending the knowledge kept in a database about the current state of the real world to include the past, in the two senses of “the past of the real world” (valid time) and “the past states of the database” (transaction time). The effort of recent years is an attempt to achieve an appropriate kind of interaction between both sub-areas of database research. Spatiotemporal databases are the outcome of the aggregation of time and space into a single framework. As delineated in several surveys in the literature of spatiotemporal databases, a serious weakness is that the majority of the proposed approaches deal with few common characteristics found across a number of specific applications. Thus, each approach, when applied to different cases, fails on spatiotemporal behaviors not anticipated by the application used for the initial model development. For this reason, the field of Moving Object Databases (MOD) has emerged and has been shown to present the most desirable properties among the various proposals. However, although a lot of research has been carried out in the field of MOD, most of the efforts do not pay attention to embedding the proposed algorithms (i.e., access methods and query processing techniques) on top of the existing DBMS on which real-world organizations rely. The goal of this chapter is to describe effective frameworks capable of aiding either an analyst working with mobility data or, more technically, a MOD developer in implementing a MOD in a real DBMS.
Nikos Pelekis, Yannis Theodoridis

Mobility Data Exploration

Frontmatter
Chapter 6. Preparing for Mobility Data Exploration
Abstract
Exploring mobility data already collected and stored in efficient database systems is the next step of mobility data management. Historical data ‘hide’ a treasure of ‘buried’ knowledge that ‘asks’ for mining. To do so, the Knowledge Discovery in Data (KDD) process typically includes the organization of historical information in a Data Warehouse (DW), a first level of analysis exploiting data cubes built upon the DW, according to a multi-dimensional model, and, then, a deeper look into the data in order to extract models and patterns that data obey or follow, using data mining techniques. In this chapter, we provide the preparatory actions in order for data mining to follow in the next chapter. In particular, we present DW approaches for mobility, especially for trajectory data (Sect. 6.1), discuss the kind of multi-dimensional analysis that is suitable for mobility data and the challenges that arise due to its peculiarity (Sect. 6.2), and present a methodology for progressive, interactive analysis that is useful to mobility data scientists (Sect. 6.3). Finally, we provide sound definitions for trajectory similarity, which is a key component of any analysis performed on trajectory databases (Sect. 6.4).
Nikos Pelekis, Yannis Theodoridis
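To make the notion of trajectory similarity from the Chapter 6 abstract concrete, here is a minimal sketch of one simple measure: the average Euclidean distance between two trajectories sampled at identical timestamps. This is an assumption chosen for illustration; the definitions given in Sect. 6.4 of the book may differ, and the function name `sync_euclidean_distance` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


def sync_euclidean_distance(a: List[Point], b: List[Point]) -> float:
    """Average distance between positions observed at the same timestamps.

    Assumes both trajectories are non-empty and sampled at exactly the
    same timestamps; otherwise a ValueError is raised.
    """
    if not a or len(a) != len(b):
        raise ValueError("trajectories must be non-empty and of equal length")
    if any(p[2] != q[2] for p, q in zip(a, b)):
        raise ValueError("trajectories must share the same timestamps")
    total = sum(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(a, b))
    return total / len(a)


if __name__ == "__main__":
    t1 = [(0, 0, 0), (1, 0, 1), (2, 0, 2)]
    t2 = [(0, 3, 0), (1, 3, 1), (2, 3, 2)]  # same route, shifted by 3 units
    print(sync_euclidean_distance(t1, t2))
```

A smaller value means the two objects stayed closer together over time; a value of zero means the sampled positions coincide.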
Chapter 7. Mobility Data Mining and Knowledge Discovery
Abstract
Knowledge discovery in trajectory databases is full of success stories in discovering interesting behavioral patterns of moving objects that can be exploited in several fields. Example domains include traffic engineering, climatology, social anthropology and zoology, implying application of the various mining techniques to vehicle position data, hurricane track data, and human and animal movement data, respectively. Mobility data mining can be categorized according to the underlying mining methods used to discover the various collective behavioral patterns. Following this categorization method, works have been proposed that try to identify various types of clusters of moving objects. Some methods group trajectories by considering the whole lifespan of the moving objects, while others try to identify local patterns that are valid only for a portion of their lifespan. Another line of research, which is parallel to that of clustering, focuses on representing a dataset of trajectories via an appropriate small set of objects, which are either artificial (i.e. the representatives or centroid trajectories of the clusters), or selected from the dataset itself (i.e. by some sampling methodology). Although clustering-oriented approaches prevail in the literature, there are many other interesting techniques that exhibit semantically rich mobility patterns and make the domain active in many areas of knowledge discovery. Among them, in this chapter we discuss sequential trajectory pattern discovery, classification and outlier detection techniques. The problem of predicting the future location of moving objects has also been tackled, with interesting results.
Nikos Pelekis, Yannis Theodoridis
Chapter 8. Privacy-Aware Mobility Data Exploration
Abstract
The increasing availability of data due to the explosion of mobile devices and positioning technologies has led to the development of efficient mobility data management and mining techniques. However, the analysis of such data may pose significant risks regarding individuals’ privacy. Consider for example a user requesting a service for nearby points of interest (POI), such as restaurants or pharmacies. Even if the user identifier is hidden, the request contains enough information to identify the requester. By linking exact coordinates sent to the service provider with publicly available information about POIs, a third party can increase the probability that the request was sent e.g. from the user’s home. Consequently, location data should be kept confidential since its disclosure may represent a brutal violation of privacy protection rights. Moreover, developing techniques able to analyze and extract significant patterns from traces left by moving objects can provide insight to the data holders and support decision-making and strategic planning activities (consider, for instance, patterns depicting typical movement behavior of people moving in an urban environment and how these patterns evolve over time). For this reason, publishing mobility data for analysis purposes is an unavoidable need. But what kinds of privacy threats arise if a MOD is released? By linking an anonymous MOD with publicly available information, is a malevolent user able to infer personal behaviors or, even worse, uniquely re-identify the user behind a trajectory? This chapter provides a survey regarding privacy-preservation techniques for location and moving object data.
In particular, we discuss the challenges with respect to privacy on mobility data, focusing on three categories of privacy-preservation techniques, namely (a) privacy in the context of Location-based Services (LBS), where a trusted server aims at providing the service without threatening the anonymity of the user requiring the service, (b) privacy-preserving mobility data publishing, where the goal is to release a sanitized version of the original MOD for public use, and (c) privacy-aware mobility data querying, where the focus is on providing anonymous answers to queries posed by the users to a MOD that is maintained in-house.
Nikos Pelekis, Yannis Theodoridis
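The LBS scenario in the Chapter 8 abstract, where exact coordinates sent to a service provider can reveal the requester, motivates sending a coarser region instead. The sketch below shows grid-based spatial cloaking under stated assumptions: the exact position is replaced by the grid cell that contains it. The function name `cloak` and the parameter `cell_size` are hypothetical, and this simple generalization does not by itself guarantee any formal anonymity level (such as k-anonymity); it is not presented as the technique surveyed in the book.

```python
import math
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def cloak(x: float, y: float, cell_size: float) -> Rect:
    """Replace an exact position with the grid cell that contains it.

    cell_size is a hypothetical tuning parameter: larger cells hide the
    user better but make the service answer less precise.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    col = math.floor(x / cell_size)  # column index of the containing cell
    row = math.floor(y / cell_size)  # row index of the containing cell
    return (col * cell_size, row * cell_size,
            (col + 1) * cell_size, (row + 1) * cell_size)


if __name__ == "__main__":
    # Two nearby users are reported with the same cloaked region.
    print(cloak(23.7275, 37.9838, 0.5))
    print(cloak(23.71, 37.98, 0.5))
```

Every request issued from within the same cell carries the same region, so the service provider cannot tell those requesters apart by location alone.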

Advanced Topics

Frontmatter
Chapter 9. Semantic Aspects on Mobility Data
Abstract
The boom of mobility data (due to the evolution of positioning devices such as GPS-enabled smartphones and tablets, on-board navigation systems in vehicles, vessels and planes, smart chips embedded in animals, etc.) has an equal share in what is called the BIG DATA era, which raises important issues for Moving Object Databases (MOD) and Trajectory Data Warehouses (TDW), which are responsible for the operational and analytical processing, respectively, of moving object trajectories. A reasonable question that arises is whether we really need all this detailed (i.e., point-by-point) information in order to perform the above processing effectively (i.e., having advanced mobile-aware applications and services in mind). To address this question, in recent years mobility data have been accompanied by semantic information (such as diaries filled in manually by citizens for urban transportation research purposes). In a different scenario, semantic information may be inferred by methods taking into account contextual information from the underlying application scenario. Thus, the answer to the above question may be simple: extract and manage (the necessary) semantics from movement and provide services and applications that are built upon them. In this chapter, we first present the background knowledge that allows us to shift the paradigm from raw trajectories to their semantic counterpart, and, subsequently, we study several methods that support a step-by-step methodology towards the reconstruction of the semantically enriched trajectories. These reflect the majority of the approaches that have been pursued in the literature, which tackle the raised issues from a conceptual point of view. Then, we go one step further by providing a blueprint of a prototype framework for designing and building real-world semantic-aware MODs and TDWs. Finally, we discuss the semantic aspects of privacy as an orthogonal dimension to the aforementioned techniques.
Nikos Pelekis, Yannis Theodoridis
Chapter 10. The Case of Big Mobility Data
Abstract
Trillions of bytes of information are being collected by companies about their customers, suppliers, and operations. Devices such as mobile phones, tablets, smart energy meters, automobiles, and industrial machines carry millions of networked sensors, which create huge amounts of data that are channeled into the Internet of Things (IoT). In addition, the massive participation of individuals on social networking sites will continue to fuel exponential data growth. Hence, the enormous amount of data being produced in our world during the last decades has posed new challenges in the world of data management. Various methods and technologies have been developed and adapted in order to store, manage, analyze, and visualize these complex, vast quantities of data in an efficient way. In this chapter, we introduce the reader to the concept of Big Data management and, then, we highlight interesting work done so far in the spatial and mobility domains. The area is still in its infancy, and a strong wave of research results and, as a consequence, commercial products is expected in the forthcoming years.
Nikos Pelekis, Yannis Theodoridis

Epilogue, Hands-on

Frontmatter
Chapter 11. Epilogue
Abstract
It was 10 years ago when a seminal research activity dedicated to spatio-temporal databases concluded as follows: “… CHOROCHRONOS has opened many avenues for research in spatio-temporal databases, but it also left us with lots of challenging research problems awaiting solution … As an epilogue to this book, we would like to challenge the reader by discussing three important application areas and the role spatio-temporal databases can play in these … Mobile and Wireless Computing … Data Warehousing and Mining … the Semantic Web …”. Since then, all three then-open topics have been extensively studied and nice results have appeared in the literature. Also in this book, the reader may find related material in Parts I, II, and III, respectively.
Nikos Pelekis, Yannis Theodoridis
Chapter 12. Hands-on with Hermes@Oracle MOD
Abstract
Moving Object Database (MOD) engines enable us to process, manage and analyze mobility data. One of the already presented systems that satisfy these requirements is the Hermes MOD engine. Hermes provides MOD functionality to OpenGIS-compatible state-of-the-art Object-Relational DBMS. Currently, Hermes comes in two implementations: the first operates on top of Oracle DBMS (denoted as Hermes@Oracle) and the second operates on top of PostgreSQL (denoted as Hermes@Postgres). In Chap. 5, we studied the collection of abstract data types (ADT) and the corresponding operations of Hermes@Oracle, which were defined, developed and provided as a data cartridge extending Oracle’s SQL query language with MOD semantics. The present chapter contains a detailed description of a hands-on experience over a dataset of synthetic trajectories (produced by the Hermoupolis generator) moving in Attiki, Greece. The methodology that we will follow is the one presented in Chap. 1, Fig. 1.6. There are several sample queries as well as analytical and mining procedures running over the Attiki dataset and showing the Hermes@Oracle API in action. Regarding installation guidelines for Hermes, as well as the specific dataset used in the current hands-on, and the Hermoupolis generator for producing datasets with different properties, the interested reader is referred to the Hermes@Oracle homepage.
Nikos Pelekis, Yannis Theodoridis
Chapter 13. Hands-on with Hermes@Postgres MOD
Abstract
The marine environment is very important in our society, with implications for the economy and nature preservation. Marine accidents that happen from time to time cost the involved parties large amounts of money and, more importantly, cause incalculable damage to the ecosystem. Unfortunately, there is a lack of monitoring of vessels at sea for conformance with the international marine safety regulations. To address this issue, the Automatic Identification System (AIS) has been made mandatory equipment for commercial and fishing vessels (above a threshold length). A vessel’s AIS transceiver gathers data from onboard sensors like GPS and compass and broadcasts it publicly so that other vessels at sea or base stations on shore are aware of the position (latitude/longitude) as well as other trip-oriented information of the vessel under consideration. This chapter presents a showcase on a real AIS dataset and exploits the capabilities of the Hermes MOD engine to efficiently process this kind of data.
Nikos Pelekis, Yannis Theodoridis
Backmatter
Metadata
Title
Mobility Data Management and Exploration
Authors
Nikos Pelekis
Yannis Theodoridis
Copyright year
2014
Publisher
Springer New York
Electronic ISBN
978-1-4939-0392-4
Print ISBN
978-1-4939-0391-7
DOI
https://doi.org/10.1007/978-1-4939-0392-4