Mining frequent trajectory patterns in spatial–temporal databases
Introduction
With advances in tracking technologies and the rapid improvement in location-based services, a large amount of spatial–temporal data has been collected in databases [8]. As a result, mining implicit and useful patterns in such databases has attracted increasing attention recently. The entities in a database that change their locations over time are defined as “moving objects”, where the movement (i.e., the trajectory) of an object can be described as a sequence of spatial coordinates, each of which is associated with a time stamp.
Finding the frequently repeated trajectory patterns can help us analyze and predict the movements of objects. Temporal information in the patterns makes many applications that are only concerned with the spatial information more realistic and practical. For example, using temporal information to understand customers’ movement (trajectory) patterns can help us make an appropriate recommendation or a mobile advertisement for them. In ecology, analyzing animals’ movement routes can help us better understand their behavior or detect a strange phenomenon or disaster, especially if some animals’ movement patterns change abruptly.
Therefore, in this paper, we propose an efficient algorithm that takes into account both spatial and temporal attributes to mine the frequent trajectory patterns. The proposed method comprises two phases. First, we scan the database once to generate a mapping graph and trajectory information lists (TI-lists). Then, we apply the proposed graph-based mining (GBM) algorithm to traverse the mapping graph and mine the frequent trajectory patterns in a depth-first search manner. Since our proposed algorithm uses the adjacency property to reduce the search space, we only need to extend a trajectory to the adjacent neighborhoods of the last location of the trajectory pattern. By using the mapping graph and TI-lists, the GBM algorithm can localize support counting and pattern extension to a small number of TI-lists. Thus, it is more efficient than the Apriori-based and PrefixSpan-based algorithms.
The main contributions of this study are summarized as follows: (1) We propose a spatial–temporal pattern to describe a trajectory, where both the spatial and temporal attributes are simultaneously considered. (2) By using the mapping graph, the GBM algorithm can localize support counting and pattern extension to a small number of TI-lists. (3) The GBM algorithm exploits the adjacency property to mine frequent trajectory patterns and reduce the search space. (4) We propose an efficient mining algorithm for discovering frequent trajectory patterns without candidate generation.
The remainder of this paper is organized as follows: Section 2 discusses the related work. Section 3 describes the preliminary concepts and the problem definitions used throughout the paper. Section 4 presents the proposed algorithm in detail. Section 5 illustrates the performance of our approach. Finally, Section 6 describes concluding remarks and future work.
Section snippets
Related work
Mining frequent itemsets is one of the fundamental problems in the area of data mining. An itemset is considered frequent if its support is not less than a user-specified minimum support threshold, where the support of an itemset is defined as the percentage of transactions in the database that contain the itemset. Many itemset mining algorithms [1], [15], [17], [24], [36], [38], [39] have been proposed for mining the frequent itemsets in a database. Agrawal et al. [1] proposed an Apriori
Preliminaries and problem definitions
A reference space of size e × e is a 2-dimensional space, where the reference space is a square, the length and width of the reference space are equal to e, e is an integer, and e ≧ 1. The x- and y-coordinates of a point gi in the reference space are denoted by gi · x and gi · y, respectively, where gi · x and gi · y are integers, 1 ≦ gi · x ≦ e, and 1 ≦ gi · y ≦ e. Consider a spatial–temporal database D = {T1, T2, … , Tu}, where Ti is the trajectory of a moving object, Ti = 〈(x1, y1, t1), (x2, y2, t2), … , (xm, ym, tm)〉, (xj, yj) is
The proposed method
In this section, we discuss the proposed GBM (graph-based mining) algorithm for mining the frequent patterns. As mentioned earlier, the proposed algorithm comprises two phases. First, we transform all trajectories in the database into a mapping graph. For each vertex in the mapping graph, we use a data structure, called a Trajectory Information list (TI-list), to record all trajectories that pass through the vertex. Then, we apply the proposed mining algorithm to find all frequent patterns. We
Performance evaluation
We conducted experiments to compare the proposed method with the modified MP (Moving Pattern mining) algorithm [9] (hereafter MP) and the modified PrefixSpan algorithm (hereafter PrefixSpan).
The original PrefixSpan algorithm is designed to mine sequential patterns over sequence databases and not trajectory patterns over trajectory databases, where time is not explicit as it is in a trajectory. Thus, the modified PrefixSpan algorithm needs to consider how to embed the time constraint into the
Conclusions and future work
In this paper, we have proposed an efficient graph-based mining (GBM) algorithm for mining frequent trajectory patterns. The algorithm comprises two phases. In the first phase, we transform all trajectories in the database into a mapping graph. Then, we create a TI-list for each vertex to record all trajectories that pass through the vertex. In the second phase, using the information recorded in TI-lists, the proposed algorithm traverses the mapping graph in a depth-first search manner to mine
Acknowledgements
The authors are grateful to the anonymous referees for their helpful comments and suggestions. This research was supported in part by the National Science Council of Republic of China under Grant No. NSC 97-2410-H-002-117.
References (41)
- et al.
Constraint-based sequential pattern mining: the consideration of recency and compactness
Decision Support Systems
(2006) Personalized local internet in the location-based mobile web search
Decision Support Systems
(2007)- et al.
Discovery of maximum length frequent itemsets
Information Sciences
(2008) - et al.
Mining maximal hyperclique pattern: a hybrid search strategy
Information Sciences
(2007) - et al.
Mining spatial association rules in image databases
Information Sciences
(2007) - et al.
Mining association rules with multi-dimensional constraints
The Journal of Systems and Software
(2006) - et al.
Efficient data mining for calling path patterns in GSM networks
Information Systems
(2003) - et al.
Flexible online association rule mining based on multidimensional pattern relations
Information Sciences
(2006) - et al.
A false negative approach to mining frequent itemsets from high speed transactional data streams
Information Sciences
(2006) Efficient mining of weighted interesting patterns with a strong weight and/or support affinity
Information Sciences
(2007)
A new framework for detecting weighted sequential patterns in large sequence databases
Knowledge-Based Systems
Discovery of periodic patterns in spatiotemporal sequence
IEEE Transactions on Knowledge and Data Engineering
Mining sequential patterns with regular expression constraints
IEEE Transactions on Knowledge and Data Engineering
Cited by (103)
Changes in mobility amid the COVID-19 pandemic in Sapporo City, Japan: An investigation through the relationship between spatiotemporal population density and urban facilities
2023, Transportation Research Interdisciplinary PerspectivesA multi-agent system for solving the Dynamic Capacitated Vehicle Routing Problem with stochastic customers using trajectory data mining
2022, Expert Systems with ApplicationsGenders prediction from indoor customer paths by Levenshtein-based fuzzy kNN
2019, Expert Systems with ApplicationsGroup evolution patterns in running races
2019, Information SciencesAn indoor trajectory frequent pattern mining algorithm based on vague grid sequence
2019, Expert Systems with ApplicationsCitation Excerpt :In the data preprocessing phase, the trajectory sequence is transformed into a sequence of regions of interest, and regions of interest are divided into two categories according to the spatial discretization: default regions of interest and hot regions of interest. Lee, Chen, and Ip (2009) proposed a trajectory frequent pattern mining algorithm GBM based on graph, which firstly scans the database once to generate a mapping graph and trajectory information lists(TI-lists), and then traverses the mapping graph by depth-first walk to mine the trajectory frequent patterns. Luo, Tan, Chen, and Ni (2013) investigated the most frequent trajectory query based on time cycle to analyze the most frequent road choice of most pedestrians.