1 Introduction

Mobile devices are continuously upgrading features allowing usage of context data such as location-dependent data [1] in order to efficiently serve the client queries with valid answer from anywhere at any time. Route is an imperative travel property in most of the location-aware application. One can achieve information about route by regularly capturing Global Positioning System (GPS) [2] data via the cell phone connected to a server. It is necessary to manage the battery power resources efficiently if tracking for an entire day is required. Frequent visit position recalculations will yield higher exactness at the cost of significant energy. If one has to entertain unnecessary frequent information transmission, then it will increase the phone bill cost and utilize additional network resources such as batteries energy as well. On the other hand, infrequent or rare positions recalculations will build battery life, but are unable to receive accurate result from traveling user’s device real-time location [3] or travel route. A well practical dynamic location management algorithm with continuously tracking and GPS request rate-fixing ability of user location and user’s route construction needs to be implemented. A sample of GPS Log and Trajectory for Client Movement is given in Fig. 1. The problem of low sampling rate GPS Trajectories-based Map Matching (MM) [4] can be stated as follows.

Fig. 1
figure 1

Log and trajectory for client movement

Definition 1

GPS log: It is a point arrangement of Pt = pt1, pt2 …, ptm. Here, every point pti Pt contains pti.t, pti.lat, pti.lng as timestamp, latitude and longitude, respectively.

Definition 2

GPS trajectories: It is represented by T. It is a sequenced set of GPS focuses in which time interim between any two back-to-back GPS focuses not surpassing a settled limit known as threshold T, i.e., T: pt1pt2 ··· pm, where T > pti+k·k > 0 for (1 ≤ i < m) and pti L. m = |T| is the count of sample points, and k is known as the interval of these sample points.

Definition 3

Segment of Road: It is represented by symbol r and is a directed line associated with id information r.id, initial point r.start, an ending point r.end, length value r.l, travel speed r.v and intermediary point sets representing the road using a polygon. Figures 1 and 2 suggest actual segments of road in OpenStreet Map Search.

Fig. 2
figure 2

Illustration of road segments

Definition 4

Road network: It can be represented as G(V, R), where V is an arrangement of set of points where one road link is connected to others road link in the street portions and R can be viewed as an arrangement of set of directed lines or edges representing street sections, where each edge is associating with two vertices.

Definition 5

Path: Path P is an arrangement of associated street fragments that begins at vertex Vi and ends at Vj, i.e., P: r1r2 …… rn, where Vi = e1.start, Vj = en.end, em+1.start = em.end, 1 ≤ m ≤ n.

The structure of paper is as follows. Review of the problem definition, motivation and key contributions linked to this research is presented in Sect. 2. Followed by this, related works are listed in Sect. 3. The ArcGIS model used in the suggested map matching is described in Sect. 4. The PSO and FIS-based MM algorithms are described in Sect. 5. In addition, an explanation of the comparative study by simulation result is given in Sect. 6. Finally, the paper’s conclusion and future scopes of the research domain are given in Sect. 7.

2 Problem definition, motivation and key contribution

Assume a vehicle is moving along a finite system of streets, N; here, one does not know. Instead one has a road network representation G(V, E) of the system in ArcGIS map. For simplicity, it is assumed that there is a one-to-one correspondence between the streets in N and the arcs in G(V, E). The G(V, E) consists of piecewise linear arcs representing roads. Each arc, A ∈ E, can be described by finite sequence of points (A0, A1, …., An). The points A0 and An are called nodes. These are the endpoints of an arc and often points on which it is not possible to move to another arc. They correspond to dead ends or junctions in the street system. Additionally, the GPS device provides us an estimate of the vehicle’s location within a certain intervals known as GPS trajectory T. The actual location of the vehicle at time t is denoted by Pt and the estimated location by Pe. The goal is to discover the road A from G which corresponds to T with its real path by matching actual location Pt to the estimated location Pe. The chain of discovered consumer positions is aligned on virtual map by means of producing a directed and annotated street map from GPS lines (or traces). The MM problem is illustrated in Fig. 3. It is a preprocessing step in any location-dependent information system (LDIS).

Fig. 3
figure 3

Map matching problem

The need of map matching algorithm is important due to two reasons. First reason is due to the mistakes in the GPS device leading to wrong location of a vehicle. Second, due to transfer, storage and other costs, sampling rates are not all the time at high frequency, making it difficult to tell the route. Therefore, it is a vital operation for application spending to track the data, data management for traffic analysis, frequent path finder and taxi pickup recommending system.

Frequent map matching steps yield more accurate result at the cost of unnecessary information transmission and additional network resources such as batteries energy, network bandwidth. On the other hand infrequent position recalculations will result in inaccurate result from traveling user’s device real-time location. The sensor information is changing continuously in time and space according to the mobile client’s motion. When map matching error is high due to low bandwidth or some other system errors, then the system should have ability to locate the mobile user on map without compromising accuracy of the map matching. Therefore, a well practical dynamic location management algorithm with continuously tracking and GPS request rate-fixing ability of user location and user’s route construction needs to be implemented. To reduce error of map matching, sensor’s information is processed into the various map matching algorithms, e.g., particle swarm optimization (PSO), fuzzy logic, Bayesian inference, Kalman filter, Dempster method or the Shafer’s principle for improvement in the precision available at any time. The previous map matching policies did not considered the available inputs, e.g., heading, speed and vehicle’s historical trajectory, etc., missing the relation between the road ties. In fact, they ignore the sources of error correlated with the road network and the orientation system. Based on the above shortcomings, the paper suggests advance map matching algorithm based on fuzzy logic which considers various input variables such as velocity direction, horizontal dilution of precision, position output perpendicular distance from candidate edge, heading change, heading error, GPS fix interval time, number of epochs. PSO has a limited number of parameters. The effect of the parameters on the solutions is small compared to other methods of optimization. It is less reliant on a number of initial points than on other techniques of optimization. It can also tackle problems of the nonlinear, nonconvex, constant, discrete and integer form vector problems. Among various optimization techniques, we have first chosen PSO as a map matching optimization algorithm here due to its fastness and being more efficient in comparison with the other optimization algorithms.

The paper key contributions are summarized with the following points.

  1. 1.

    Integration of fuzzy inference and recommendation in MM algorithms to filter the errors of navigation system and get an accurate FIS-based navigation systems—various input variables used by fuzzy inference and recommendation in MM algorithms include velocity direction, horizontal dilution of precision, position output perpendicular distance from candidate edge, heading change, heading error, GPS fix interval time, number of epochs, etc.

  2. 2.

    Integrating PSO in MM algorithm of unreliable navigation systems—it has very high rate of positioning errors related to MM. For two datasets with extreme variation in direction and initial position PSO is used here to solve the local optimal problem to get better transformation outcomes. The system then uses part of the global map as the dataset of reference with overlapping points for corresponding map matching.

  3. 3.

    Involvement of false matching relationships pin the traditional MM method—here, PSO-based MM of road for running the vehicle is adopted which leads to reduction in false matching relationships. The corresponding precision of the PSO-based MM algorithm has arisen considerably. Moreover, fuzzy logic-based algorithm shows best map matching accuracy and most insensitive to the variation in sampling interval. This paper includes a comparative study of two proposed methods, i.e., FIS and PSO incorporation in MM algorithms in terms of recall and accuracy to that of conventional MM algorithms.

  4. 4.

    Providing a scheme to analyze whether implementation of the system will meet the specification or not—to achieve this goal, a shortest path discovery implementation, i.e., A* algorithm in the FIS, R-tree-based spatial indexing to save map information, included in the implementation of LBS. For implementation of the algorithm, the navigation system consists of 16-bit microcontroller 80C196KB, user interface system, route guidance system and a positioning system.

  5. 5.

    In the area where available bandwidth is very low or sometimes, completely gone. In this case, GPS request sampling rate fixing is very tough and possibilities of map matching error or more formally LBS map matching failure are very high. This paper is equipped with practical dynamic location management algorithm to support fault-tolerant approach with continuously tracking and GPS request rate-fixing ability of user location and user’s route construction.

  6. 6.

    Analysis of effects of different parameters related to FIS and PSO on system performance. By using a simulation for the probabilistic path forecast model, we have generated a big quantity of historical information for parameter training. In FIS-based MM, a field test for 20 min validates the performance of the algorithm. We have conducted an assessment of performance (precision and effectiveness) of MM method from actual driving information, whose path was already known.

3 Related work

One can achieve information about route by regularly capturing GPS data from the cell phone through a server. Navigation system involves deploying real-time position on road network which is obtained from given sensor, e.g., GPS. Various parameters contribute some positioning errors to these navigation systems, e.g., masking of GPS signal, errors related to receiver, propagation and/or also satellite. An observed GPS position has measurement errors due to GPS devices limitation and sampling errors due to low sampling rate [5]. Therefore, it is needed to be preprocessed to align it on a given virtual map. The challenges of MM differ depending upon the sampling rate and GPS accuracy. In practice, low sampling rate GPS trajectories are easily available. These GPS trajectories have up to one point every 3–5 min. But, current MM schemes deal with high sampling rate GPS trajectories only. These GPS trajectories typically have up to one point every 5–30-s GPS data. If one employs current MM schemes to low sampling rate points, it will become less effective as increment of uncertainty in data. Numerous strategies for sending cell phone location updates to a server have been examined in [6,7,8,9]. In polling survey, server pulls the coordinates (or any other location information) from the cell phone with certain interim. It is valuable for infrequent position refreshes, for instance, application showing the present location of gadgets on a map guide. This polling scheme is inefficient for real-time frameworks or where detailed route recording is necessary. This scheme has disadvantage that when interim (period) is set to be small; then, endless measure of pointless information can be sent to the uplink of the server. Again, if time interval set to be long, then framework will be more proficient; however, they do not address the issues of real-time applications. Zone-based map matching is another scheme in which the mobile node sends observed GPS point information when it goes into or leaves specific area. In distance-based strategies, mobile node sends GPS fixes only when mobile node has surpassed a separation limit or threshold. These schemes have disadvantage as well that it sends unnecessary location updates when mobile nodes follow straight line route. An another scheme to MM is dead reckoning that finds whether it needs to send the GPS fix to server solely based on the most recently sent location information or not. But, this scheme requires that the estimation function must be concurrently executed on the server and devices. If mobile device finds some deviation in its position from what is expected to be by executing estimation function based on most recent server information, then it will send the new location information to the server. This method is advantageous as it rarely transmits the location data, but it has one disadvantage as well because it requires the potentially costly estimation functions to be continuously executed on both the server and the mobile device which consumes more resources and unsuitable for real-time framework. In navigation, dead reckoning is the procedure of using so as to figure one’s present position a formerly decided position, or settle, and propelling that position based upon known or evaluated speeds over slipped by time and course. The tall buildings block the GPS signals. It is the urban canyon problem. Dead reckoning is still necessary for uninterrupted service. With no view of the GPS satellites, the simplest thing the software can do to use last known position and last known velocity and extrapolate forward in time. This approximation starts to fail quickly as one’s car speeds up or makes turns. Next best is to use the accelerometer, gyro and compass. The phone has these sensors. This gives the software information about car orientation and change of velocity. Fused with last known GPS position, this provides better dead reckoning, i.e., the extrapolated position remains accurate for longer time. What the car’s navigation unit has that one’s phone does not is the actual distance-traveled information and the wheel revolution count. This is a more accurate distance measurement than can be provided by the phone’s accelerometer. This is one reason why the car’s navigation computer is better than phone’s nave software in spotty GPS coverage situations.

To obtain extremely economical location-based services (LBS), an associative algorithm rule that mixes aspects of many update techniques and dynamically adjusts at run time is needed. In [10], critical point algorithm can be viewed as a hybrid algorithmic rule fulfilling this goal. An associative algorithmic rule is needed to manage the trade-off between GPS fix request rate and energy-supporting real-time application needs. The various approaches [5] for MM of GPS observation can be broadly categorized into the following classes.

  1. 1.

    Local/incremental method It is used to find local matches of geometries. To do so, it evaluates the candidate edges by summing the scores of corresponding individual similarity measures. It accounts for two types of score to evaluate candidate edge, namely distance similarity score and orientation similarity score. If one finds adjacent edges for each sample, then he can use adaptive clipping method as in [11] which it employs Dijkstra algorithm to find shortest path on local free space graph. The algorithm takes O(mnlogm) time, where n and m are number of GPS points and number of edges in the road network, respectively. In [12], author proposed a segment-based matching approach which gives confidence values to different sampling points. The matching is done in order of high confidence sampling points to the low confidence sampling points. While matching a new position, this method considers a small portion of the trajectory only which are near to the position. The performance of this method is very well if sampling frequency is high, e.g., 1–10 s. But if sampling rate reduced that is if the interval for sampling is increased, then it will show the problem of arc-skipping resulting in inaccuracy.

  2. 2.

    Global method This method is used to match the entire trajectories with the given road network. To achieve this goal, Frechet distance is used which records the continuity of curves. In [13], algorithm searches all the critical values through different parameters. The solution of decision problem is given by exploring a path from the lower left corner to the upper right corner in free space. The algorithm time complexity is (mnlog2 mn), where n and m are the number of nodes in the road network and number of edges, respectively. The effect of outliers is reduced using average Frechet distance described in the later work [13]. The aim of global methods is to minimize Frechet distance between the trajectory and the matched road segments.

  3. 3.

    Statistical method Statistical-based method for MM is gaining attention over decades. In [14], hidden Markov model was incorporated with Bayesian classifier for map matching of GPS fixes. In [15], cubic spline interpolation and the extended Kalman filter are combined with spatiotemporal matching methods to improve the system performance. Shortest path computations are evolved in the spatiotemporal matching policies. Dijkstra’s, search decomposition, A* algorithm [16], bidirectional search [17] and hierarchical search [18] are well-known shortest path computation algorithms. For faster computation, the shortest path computation performs preprocessing with policies such as ALT [19]. Euclidean distance method and ALT policy have the tighter lower bound than that of A* algorithm. Moreover, various other approaches are also came into existence such as in [20]. The author proposed method for intersection geometries extraction. In [20], centerlines lanes count is considered, and in [21], road type and speed limit are considered to MM. The trajectories may involve irregularities related to the acceleration, the speed traveled, direction changes and abrupt expected distance between points. So, a filtering step is necessary before the map generation process, in which traces are tested for any irregularities. One can replace points from the GPS trace by an interpolated point if given point does not satisfy the given criteria. The k-means algorithm, trace merging and kernel density estimation (KDE) are the classes of algorithms used for map generation. K-means algorithm is performed to deduce the set of GPS points by cluster centroids [22, 23], which are linked together to formulate road geometry. KDE approach estimates kernel density based on the GPS points and edges. This estimation of kernel density helps in geometry extraction for roads. Table 1 shows the summary of MM-associated technical papers.

    Table 1 Summary of map matching-associated technical papers

To filter the navigation system errors and get an accurate FIS-based navigation systems, it needs integration of fuzzy inference and recommendation in MM algorithms. Various input variables such as velocity direction, horizontal dilution of precision, position output perpendicular distance from candidate edge, heading change, heading error, GPS fix interval time, number of epochs could be used in fuzzy inference and recommendation in MM algorithms. Fuzzy logic-based algorithm used in various LBS problems shows better accuracy and is the most insensitive to the variation in sampling interval.

In the past, Olivas and Valdez [30] illustrated a summary of the various methods used for type-2 fuzzy particle swarm, bee colony and bat algorithms in controller intervals. Castillo et al. [31] also proposed a new type-2 fuzzy model architecture methodology. Valdez et al. [32] provided a comprehensive review of some optimization techniques, in which fuzzy logic takes care of parameter adaptation. In the field of intelligent control, Castillo et al. [33] identified the use of hierarchical genetic algorithms to automate fuzzy logic.

Moreover, integration of PSO in unreliable navigation systems can be done where there is very high rate positioning errors related to MM exist. Traditional MM method involves many false matching relationships. PSO-based MM of road for running the vehicle can be adopted which leads to reduction in false matching relationships. Melin et al. [34] developed a new method for changing the social and cognitive parameters of PSO (c1 and c2) with the aid of fuzzy logic. Valdez et al. [35] proposed a new optimization strategy that blends GA and PSO, utilizing fuzzy logic to integrate GA and PSO effects. In [36], Olivas and Valdez integrated fuzzy with PSO by suggesting interval type-2 fuzzy logic-based dynamic parameter adaptation in particle swarm optimization. For two datasets with extreme variation in direction and initial position PSO can be used to solve the local optimal problem to get better transformation outcomes. The system then can use part of the global map as dataset of reference with overlapping points for corresponding map matching.

4 ArcGIS

ArcGIS, as the name goes, is a geographic information system tool developed by ESRI back in 1999 and written in computer language C ++. Current releases of ArcGIS are versions 10.4.1 and 10.4.2. It is compatible with Android, iOS and Windows. It has varied uses starting with making/digitizing maps, compiling data, analyzing mapped data and most importantly managing all these information on a single database. Basically, it allows one to capture, store, analyze, manipulate and manage huge amount of data at one platform. The best part is that one can export the data into various other modeling of software to further explore possibilities and to play around one’s datasets. ArcGIS has millions of geoprocessing tools. ArcGIS is first starting platform in GIS Industry. All GIS and online application depend on this software. In the coming era, one can think that everyone will be dependent on GIS systems. For example in India, one can say his cities are being converted to Smart City, but converting a simple city to smart city, one needs a big GIS System. Apart from this, one needs to learn SQL PostGIS, R, QGIS, etc. These are the free application with great GIS features. An ArcGIS map facilitates us to create customized road networks, base maps, choose travel modes, asset locator, etc., with mobile system and navigator.

5 MM algorithms

MM is the operation of matching recorded geographic coordinates to a logical model of the real road networks making task easier of urban computing such as traffic systems. More than 10 algorithms have been tested in decades for this problem. But some of them are tested only on specific database, and that makes it hard to participate for the decision which algorithm to apply for different situations. To address this problem, a wide range of existing MM algorithms survey is given in [37, 38]. Based on the main techniques, one will put them in different categories and compare them through experiments of real-world and artificial dataset with different characteristics. One can also report vast finding obtained from the experiment and provides insights of the strength and weakness of existing MM algorithm which will help participant to apply the appropriate algorithm. In MM integration of digital map data with its GPS and dead reckoning (DR), i.e., odometer reading (∆d), errors associated with it (εθ), gyro-rate reading (∆θ), errors associated with it (εd) are responsible for required navigation performance. EKF algorithm is used by Quddus and colleagues [39] for this integration of GPS and DR. Finding the real location of any moving vehicle on a road is key requirement in any MM algorithm. GPS/DR outputs are vehicle speed (v), heading (θ), northing (N), easting (E) and the error variances, respectively (i.e., σv, σh, σe, σn). MM gives the output as correct link, the directional position solution (easting and northing of respective links) and uncertainty associated with it (σmm).

In map matching process, road hyperlinks having its confidence value more than the defined minimum confidence threshold (min_thresh) are considered as candidate link. If the candidate area does not incorporate any link, system takes the assumption that car is off-route of recognized avenue link. The general MM framework is depicted in Fig. 4. In a scenario, where the candidate area includes only one segment, the selection technique would be simple. If multiple links are presented around position fixes, then FIS, neural network or PSO can be used to perceive the right hyperlink among the candidate links. Any MM algorithm [40,41,42] consists of two sections. First section has the steps for correct link identification, and the second one has location determination steps on the selected link. The initial MM process steps involve position fix to the link perpendicular distance, link bearing, vehicle direction, GPS receiver first fix time, search space derived from the error variances, etc. Identification of the first link is made by few first link position fixes.

Fig. 4
figure 4

Map matching framework

5.1 Using fuzzy inference system

Computing can be categorized as hard computing as well as soft computing. Hard computing is related to numerical analysis, crisp systems and binary logic, while soft computing is related to probabilistic reasoning, fuzzy logic and neural networks. The current trend follows the use of fuzzy logic in mixture with genetic algorithms [43] and neurocomputing to instruct fuzzy membership functions [44]. Adaptive Neuro-Fuzzy Inference System (ANFIS) in MATLAB 6.5 can be used to optimize fuzzy membership functions. Membership functions determination is done by least squares or backpropagation method. Antecedent (input) to the consequent (output) connection in FIS is made by the If–then rules. The weights in FIS are defined based on their criticality. The basic fuzzy logic consists of some important steps which are shown in Fig. 5.

Fig. 5
figure 5

Fuzzy inference system (FIS)

The output of Mamdani-type FIS is fuzzy set, while the output of Sugeno-type FIS is either weighted expression or a constant.

  • e.g., Mamdani: If P is Y1 and Q is Y2, then R is Y3. (P, Q, R are fuzzy sets.)

  • Sugeno: If P is Y1 and Q is Y2, then R = aY1 + bY2 + c (weighted expression).

In order to implement fuzzy logic, one can use Mamdani fuzzy inference system (FIS). A FIS initially fuzzifies the value of the parameter which results the value to be in range of [0, 1]. For this, a membership function is defined. Various kinds of membership function can be used. But for the simplicity of the model, one can use the simplest type of membership function known as the triangular membership function (TMF). It fuzzifies vi ∈ V for each pi ∈ P. Further, the output generated by the inference system should also be defined by a variable in the form of fuzzy value means its range should fall between 0 and 1(inclusive). In our case, a triangular membership function for this output variable is used and the TMF can be defined mathematically as given in the following equation.

$${\text{Membership}}{\mkern 1mu} {\text{function}}\left( {\mu f} \right), = \left\{ {\begin{array}{*{20}l} {0,} \hfill & {\quad x \le a} \hfill \\ {\frac{b - x}{c - b},} \hfill & {\quad a \le x \le c} \hfill \\ {\frac{b - x}{c - b},} \hfill & {\quad c \le 0 \le b} \hfill \\ {0,} \hfill & {\quad b \le x} \hfill \\ \end{array} } \right.$$

The representation of TMF is given in Fig. 6.

Fig. 6
figure 6

Graphical representation of triangular membership function

Another form of this membership function can be used according to the nature and behavior of the introduced parameters in which the equation simply maps the value between 0 and 1, µ(x) → (0,1), or one can say μ(x) maps the value vi ∈ v in the range of [0, 1]. This is called as fuzzification of parameters. Now, two boundary parameters a and b are also needed to be optimized and normalized for every boundary condition defined by the membership function of each parameter. So, if a parameter has n boundary condition, then one has to optimize 2n boundary parameters. In a more general term, if a parameter pi ∈ P has a bi number of boundaries, then the output generated by FIS is in fuzzy form which defines a limit on minimum and maximum values of it. So, if the range of T-value is Tmax to Tmin, then,

$${\text{Number}}\,{\text{of}}\,{\text{parameters}}\,{\text{to}}\,{\text{optimize}}\,\left( {\mathop \sum \limits_{i = 1}^{n} b_{i} } \right) + 2$$

The likelihood associated with a link is the crisp output which is obtained by weighted average method. The min() function is used to obtain degree of applicability (ωi) for each fuzzy rule. In order to calculate the likelihood value, the following expression is derived by considering certain range of membership function from a to b.

$$\begin{aligned} & P_{f1} = \frac{{v_{1} - a}}{b - a}\quad P_{f2} = \frac{{v_{2} - a}}{b - a} \ldots \ldots \ldots P_{fn} = \frac{{v_{n} - a}}{b - a} \\ & {\text{Range}}\,\left( {P_{f1} , \, P_{f2} , \, P_{f3} } \right) \to \left[ {0, \, 1} \right] \\ & P_{\hbox{min} } = {\text{Min}}\left[ {P_{f1} , \, P_{f2} , \, P_{f3} } \right] \\ & {\text{Likelihood - value}} = 10 \times \left( {1 - \frac{{\mathop \smallint \nolimits_{a}^{b} \left[ {P_{\hbox{min} } \times \left( {P_{\hbox{min} } \times \left( {b - a} \right) + a} \right)} \right]{\text{d}}x}}{{\mathop \smallint \nolimits_{a}^{b} [P_{\hbox{min} }] {\text{d}}x}}} \right) \\ \end{aligned}$$

Pfn represents the fuzzified value of any parameter Pn, and Pmin is the minimum value among all the parameter after fuzzification.

The various input variables which are used in fuzzy logic map matching algorithm are listed below.

  1. a.

    Moving object velocity v (m/s),

  2. b.

    Direction of velocity with respect to links

  3. c.

    Direction of movement with respect to links,

  4. d.

    HDOP (horizontal dilution of precision)

  5. e.

    Position output perpendicular distance from candidate edge PD (m),

  6. f.

    Heading change (obtained from the gyro) (HI)

  7. g.

    Heading error, HE (degree)

  8. h.

    GPS fix interval time (obtained from GPS)

  9. i.

    Number of epochs

The following types of fuzzy rule sets are involved in the fuzzy logic map matching algorithm.

  1. 1.

    Fuzzy rule set for initial link and vehicle position on the link.

  2. 2.

    Fuzzy rule set for the next map matching link followed by current link.

  3. 3.

    Fuzzy rule set for junction of the links.

While designing the rule base, dependency of likelihood value on Vi is considered which is predicted by simulating it individually. For each parameter, limited numbers of boundaries like HIGH, MEDIUM and LOW are defined according to fuzzified value of that parameter as depicted in Fig. 7. The selection of the boundaries is the matter of choice, and one can choose as many as it is required for implementing the system. Suppose S = {H, M, L}, where H ← High, M ← Medium, L ← Low, here S1[i] means ith element of S for the first parameter. In the same way, Sn[i] denotes ith element of S for nth (last) parameter and i ← 1, 2, 3. Table 2 depicts all set of rules that is used in the rule base. The general form of rules in the rule base is given below, in which S1, S2 and S3 denote set of boundary for the first, second and third parameters, respectively. The FIS MM single output is the likelihood of position fix to the link. In Sugeno, fuzzy map matching constants are taken for low as 10, average as 50 and high as 100.

Fig. 7
figure 7

Representation of boundary parameter of P

Table 2 Set of rules used in the rule base

The initial map position (IMP) fix is denoted by P as given in Fig. 8. FIS evaluates link 1 to be the real link for position fix P. Then in subsequent step the FIS, finding of the link 1 will be the real link for position fix Q and R. Therefore, it can be observed that the position fix S lies on the link 1.

Fig. 8
figure 8

Case of IMP

5.2 MM based on PSO

PSO-based approach and basic MM steps are combined to propose a hybrid model in this paper. The PSO-based approach is used to optimize the dataset to find best link in link space. The idea of PSO was taken from social psychology and computer graphics. When a group of birds go for the search of food, they continuously communicate to each other. Due to this communication process, they share the information to each other which one is the best path to follow and which one is the best location to be safe and to get food. Due to this social behavioral nature, if one finds out a desirable path to go, all the rest birds follow the path quickly. It is used to achieve the optimally (globally) best solution among the given number of solutions. PSO is a branch of an evolutionary computation and well known in optimization techniques. PSO is much similar to genetic algorithms. But unlike GA, it does not possess any evolution operators like mutation and crossover. It contains a lesser number of parameters so that the adjustment is easy here. It is used to solve both continuous and discrete optimization problems. In PSO, it maintains a number of potential solutions which are termed as the particles and the whole population of solutions called swarm optimization [45,46,47]. It was originated by British Military in 1940s. It is mathematical technique and used in finding the maxima or minima of certain function for a feasible region. It may be linear or nonlinear. As birds fly in 3D space, they search for food or location to survive; algorithm uses the particle’s (bird’s) information moving in n-dimensional space. Solutions are extracted from particle as it is treated as a solution in an N-dimensional space. The variables used in PSO are number of iterations, number of particles or swarm size, velocity and acceleration coefficients. PSO algorithm is easy to implement, and its performance measurement is better in comparison with other optimization algorithms. The parameters of PSO [48] are discussed below.

  1. 1.

    Swarm size The count of solutions (particle) within the swarm is termed as the swarm size. For ‘n’ particles, the swarm size will be n. With the increase in swarm size, the area of search space also increases, and further, the running time and computational complexity also overburden due to this.

  2. 2.

    Iteration numbers The count of iteration to achieve the goal depends upon the problem. Lesser iterations count may cause the end of the searching prematurely. Iterations count in higher amount makes process time-consuming.

  3. 3.

    Velocity components This is made up from the 3 of its components as inertia (previous movement direction data), cognitive component (ith particle past performances) and social component (ith particle performances related to a group of neighbor particle).

  4. 4.

    Coefficients of acceleration The c1 and c2 are the two acceleration coefficients, where c1 and c2 measure the degree to which a particle confidence is in itself and its neighbors, respectively.

  5. 5.

    Fitness function Fitness function is used to conclude how much fit the particle/solution is, i.e., it evaluates the quality of the candidate solution at its best.

5.2.1 Proposed fitness function for PSO algorithm

First, the collection of pairs of candidate sequences {P1, P2, …, Pi} from neighborhood matching pairs (NMP) is abstracted as particles for traffic MM. These i number of particles for each links are abstracted from the neighborhood matching links of current location fix of moving vehicle. Pi ∈ NMP, a particle can be defined as below.

$$\{ (A_{1} ,B_{1} ),(A_{2} ,B_{2} ) \ldots \ldots (A_{n - 1} ,B_{n - 1} ),(A_{n} ,B_{n} )\}$$

Ai ∈ road source network (RA) and is a point of the map trajectory, and Bi is the corresponding link in the road network. The trajectory point and road link pair are decided in the start of the evaluation of the road source network. In the highway network, single-node matching pairs correspond to 1-D of M-D particle space. M matching pairs represent a particular similarity comparison solution that corresponds to the particular particle place in M-dimensional space. The measurement of resemblance between nodes demonstrates that it is possible to express the similarity measurement model of road which is used to calculate the fitness of each neighborhood link from given trajectory. Here, higher is the fitness function value of a record, higher will be the fitness of that record and the best will be the instance among the records. Now, the proposed fitness function f for ith instance, i.e., particle pair similarity (PPSi), is expressed as given below.

$$f = {\text{PPS}}_{i} \left( {A_{j} , B_{k} } \right) = \varepsilon_{1} \times {\text{TOP }}_{\text{Sim}} \left( {A_{j} , B_{k} } \right) + \varepsilon_{2} \times {\text{GEO}}_{\text{Sim}} \left( {A_{j} , B_{k} } \right) + \varepsilon_{3} \times {\text{SEM}}_{\text{Sim}} \left( {A_{j} , B_{k} } \right)$$

where TOPSim is the degree of resemblance between road entities of the topological features. GEOSim is the resemblance between street entities of geometric characteristics. SEMSim is the similarity between street entities of semantic characteristics. ε1 ε2 and ε3 are assigned weightages for respective contributions.

The PSO-based MM algorithm flowchart is given in Fig. 9. PSO-based MM algorithm uses (subset of) road network base dataset (numeric) retrieved from UCI repository. Here, a subset of whole dataset including feature selection in preprocessing (to reduce number of attributes) is used. The fitness function and PSO parameters are being updated in each iteration; also, the classification accuracy is checked over WEKA. The default values of parameters used in PSO are given below.

$$\begin{aligned} & {\text{Position}}\,x_{i}^{t} = 0\quad \quad w = 0.729 \\ & {\text{Velocity}}\,v_{i}^{t} = 0\quad \quad {\text{rand}}_{1} = 0.4 \\ & C_{1} = C_{2} = 0\quad \quad {\text{rand}}_{2} = 0.6 \\ \end{aligned}$$

For each instance (record) set, Pbest is current fitness value and Gbest is best (highest) fitness value among all the particles (instance). Update position and velocity of each particle (instance) are given by the equation below.

$$\begin{aligned} & {\text{Here}}\,c_{1} = c_{2} = 2,\quad w_{i} = 0.729,\quad {\text{rand}}1 = 0.4,\quad {\text{rand}}2 = 0.6 \\ & {\text{Position}}\,x_{\text{id}}^{\text{old}} = 0\quad \quad {\text{Velocity}}\,v_{\text{id}}^{\text{old}} = 0 \\ & {\text{Particle}}\, ( {\text{instance)}}\,{\text{velocity}} = v_{\text{id}}^{\text{new}} = w_{i} \cdot v_{\text{id}}^{\text{old}} + c_{1} \cdot {\text{rand}}_{1} \cdot (p_{\text{id}} - x_{\text{id}} ) + c_{2} \cdot {\text{rand}}_{2} \cdot (p_{\text{gd}} - x_{\text{id}} ) \\ & {\text{Particle}}\, ( {\text{instance)}}\,{\text{position}} = x_{\text{id}}^{\text{new}} = x_{\text{id}}^{\text{old}} + v_{\text{id}}^{\text{new}} \\ & {\text{pbest}} = {\text{fitness}}\,{\text{value}}\,{\text{till}}\,{\text{current}} = x_{\text{id}}^{\text{new}} \\ \end{aligned}$$

The value after calculation is further checked for the gbest (i.e., the record among all, which have highest fitness value of particles). At this step, one can get matching pair similarity in one dimension. Although in real-time M dimension particle space, many particle matching pair’s changes happen. The PSO evolves several iterations to update particle fitness value synchronously according to each dimension position and velocity changes. The finalization in the entire particle position will be acknowledged at long last after final cycle, and the likeness of the final street can be portrayed toward the end when stop criterion is met.

Fig. 9
figure 9

Flowchart of PSO-based map matching algorithm

6 Simulation result of the FIS- and PSO-based map matching algorithm with example dataset

All algorithms have been implemented in Microsoft Visual Studio 2005 (C++) with ArcGIS on a computer with a 3.50 GHz Intel Core i7-4770 K CPU and 32 GB RAM. Map information is saved with R-tree-based spatial indexing [49]. To discover the shortest path in the FIS and PSO-based MM method, A* algorithm [16] was used. We have conducted two evaluations in this work. The first one was an assessment of performance (precision and effectiveness) of MM method from actual driving information, whose path was already known. By using simulation information, the second one assesses the enhancement in precision obtained through parameter training. By using a simulation for the probabilistic path forecast model, we have generated a large quantity of historical information for parameter training. In FIS-based MM, a field test for 20 min validates the performance of the algorithm. Speaking details of the hardware used in the implementation of the algorithm, the navigation system consists of 16-bit microcontroller 80C196KB, user interface system, route guidance system and a positioning system. The range of moving vehicle speed is from 10 to 50 km/h. The navigation system is L1-only coarse acquisition code capability, five parallel-channels, single-board, micro-tracer low power. The navigation system consists of a serial port RS232 for communication of mapped position from navigation system to computer and another sensor handling serial port for the GPS receiver. Table 3 shows the amount of nodes and the amount of road network data.

Table 3 Nodes and road network data

Table 4 represents confusion matrix for given PSO- and FIS-based map matching method.

Table 4 Confusion matrix in map matching

In this depiction of matrix, P and S represent the amount that the given method properly results the accurate match and the number that the method properly denies, respectively. The mismatches Q and R are the amount that the given method wrongly results as lawful match into accurate mismatch and lawful mismatch into accurate match. The scientists give some of the most popular metric schemes to measure the efficiency of algorithms such as precision, recall and accuracy. These are used in this article to assess the matching outcomes. The formula is as follows.

$$\begin{aligned} {\text{Positive}}\,{\text{precision}} & = \frac{P}{Q + R} \\ {\text{Negative}}\,{\text{precision}} & = \frac{S}{R + S} \\ {\text{Positive}}\,{\text{recall}} & = \frac{P}{P + R} \\ {\text{Negative}}\,{\text{recall}} & = \frac{S}{Q + S} \\ {\text{Accuracy}} & = \frac{P + Q}{P + Q + R + S} \\ \end{aligned}$$

Performance in terms of both precision and effectiveness was assessed. Recall is the proportion of correctly identified relevant pages, while precision is the proportion of the correct predicted relevant pages. The comparison in accuracy and recall rate between the traditional method and PSO-based MM algorithm is depicted in Fig. 10. Traditional MM method involves many false matching relationships. Here, PSO-based MM of road for running the vehicle is adopted which leads to reduction in false matching relationships. The corresponding precision of the PSO-based MM algorithm is increased from 51 to 82%. Further, the recall rate of the suggested algorithm has risen considerably from 60 to 84%. Experiments show that PSO significantly outperformed in accuracy and recall to those of previous algorithms due to the parallelism of its own algorithm as given in Table 5. F-measure (F1) is a function of precision, and recall can be given by the following equation.

$$F1 = 2 \times \frac{{{\text{Precision}} \times {\text{Recall}}}}{{{\text{Precision}} + {\text{Recall}}}}$$

Running time in this work was the point assignment, which is for online MM techniques and is the most significant precision metric. Many scientists suggested efficiency-enhancing methods. Figure 11 displays the running of the different sampling interval techniques. The horizontal axis reflects the number of trajectories used in sampling, and running time is represented by the vertical axis. The runtime performance comparison of all four algorithms is given in Table 6.

Fig. 10
figure 10

Comparison of traditional method and PSO-based MM algorithm

Table 5 Comparison of traditional MM and PSO-based MM algorithm
Fig. 11
figure 11

Evaluating running time

Table 6 Effects of no. of trajectories on running time of traditional and fuzzy logic-based MM algorithm

We have compared the 3 algorithms, namely GridST [50], statistic [51] and ST matching [5], with proposed FIS-based MM algorithm. We have used Road network of Delhi, India, having lakes of vertices and edges on it. We implemented synthetic trajectory dataset for the simulator by randomly selecting starting and endpoint with consideration of certain fixed range of vehicle speed. The runtime performance of all four algorithms shows similar trends, i.e., it is proportional to number of trajectories in navigation system. ST matching and GridST algorithms involve matrix calculation which is time-consuming. Therefore, they have higher runtime and less efficiency than fuzzy logic algorithms and the statistic.

Figure 12 displays the effectiveness of different sampling intervals of the techniques. The horizontal axis reflects the sampling interval, and accuracy is represented by the vertical axis. The accuracy performance comparison of all four algorithms is given in Table 7.

Fig. 12
figure 12

Evaluating impact by sampling rate on accuracy

Table 7 Effects of sampling interval on accuracy of traditional and fuzzy logic-based MM algorithm

The four tested algorithms exhibit similar matching accuracy trends, i.e., it is proportional to length of sampling interval in navigation system. From graph, we conclude that the technique of matching the FIS map is more accurate than earlier techniques. Due to similar basic logic, GridST and ST matching algorithms present similar matching accuracy. Statistics MM algorithm has the worst matching accuracy, while fuzzy logic-based algorithm shows best map matching accuracy [52] and most insensitive to the variation in sampling interval [53].

Effect of number of A* map matching queries, speed and trajectory length (in km) on accuracy, running time and memory tested on one-day dataset containing 13,671 trajectories and 155,068 points is given in Tables 8 and 9.

Table 8 Effect of number of A* MM queries, speed and trajectory length (in km) on accuracy and running time on one-day dataset containing 13,671 trajectories and 155,068 points
Table 9 Effect of speed and trajectory length (in km) on accuracy, running time and memory tested on the 15-day dataset containing 322,347 trajectories and 3,402,360 points

7 Conclusions

The MM algorithm involves correlating sampling point to best suitable location on real map to find actual route traversed by the vehicle. Many MM algorithms have been introduced in previous decades. This article provides a comparison of online MM techniques using FIS and PSO with likelihood or similarity measurement. Experiments show that PSO significantly outperformed in accuracy and recall to those of previous algorithm due to the parallelism of its own algorithm. Further, the fuzzy logic-based MM algorithm achieves a significant performance improvement in terms of running time and accuracy as compared to previous cache replacement policy such as GridST, statistics and ST matching for LBS.

The avenue of the further research is to automate heterogeneous data sources cleaning processes such as emergency vehicles, public transportation vehicles, taxis and cell phones device in MM navigation system that will overcome the challenges faced by MM application in LBS and further focus on user privacy issues. As future work, one can use large-scale, real-world datasets in fuzzy logic and PSO-based MM that needs enormous data cleaning effort. Moreover, multi-layer perceptron, Markov predictor, decision tree classifier and sequential pattern mining could be integrated to design an assistive technology (recommender system) in healthcare for helping users with intellectual, physical or sensory-diminished capabilities persons by reducing moving object navigation errors in LBS. Implementation of recommender system should integrate a predictive computing feature to the system, so that it would be capable of predicting a user error and alerting him in advance.