GPS is commonly used for robot localization, but it is unavailable in some conditions, such as indoor environments [1]. A common approach to indoor localization is to estimate the robot's pose with inertial sensors; however, this approach cannot readily compensate for wheel slipping, so the accumulated errors greatly degrade the estimation accuracy [
2]. In [3], non-cooperative feature points are extracted for navigation, a task that is usually based on satellite feature points. In [4], graph-based methods are applied to robot localization and navigation. In [5], the whole simulation process of robot navigation is implemented. The drawback of these methods is that robust feature points are difficult to obtain because they are easily affected by lighting changes. Recently, wireless sensors have been used as reference nodes in mobile robot navigation; these methods can be divided into two categories [
6–8]: one implements robot localization using the wireless sensor network's own sensor nodes [9], and the other relies on external targets [10]. This article focuses on the latter, that is, robot localization using external sensor targets. Vision features are also commonly used in robot localization and motion estimation. According to the specific localization mechanism, existing wireless sensor network localization methods can be divided into two categories [
11]: range-based methods and range-free methods. A range-based positioning mechanism measures the distance or angle between the unknown node and the anchor nodes, and then uses trilateration, triangulation, or maximum likelihood estimation to calculate the position of the unknown node. A range-free positioning mechanism does not require distance or angle information, or does not measure it directly, and positions nodes based only on network connectivity and similar information. Range-based localization has the advantage of higher accuracy; the commonly used ranging technologies are RSSI (received signal strength indicator), TOA (time of arrival), TDOA (time difference of arrival), and AOA (angle of arrival) [
12]. TOA is the time of arrival method, and TDOA is the time difference of arrival method. Both are positioning methods based on the propagation time of the signal, and both need three base stations at known positions to be available at the same time. In TOA, the three arrival times give the distances from the device to the three base stations; the position then follows from setting up and solving the corresponding geometric equations. TDOA does not solve for the distances directly; it first computes the differences between arrival times, then sets up and solves a system of equations in those differences to obtain the position. Because the TOA calculation depends entirely on time, it requires very precise time synchronization across the system: any small timing error is amplified many times in the position estimate. Multipath propagation introduces further large errors, so pure TOA is rarely used in practice. TDOA can greatly improve positioning accuracy because the differencing step cancels a large part of the timing error and the multipath effect. Because TDOA has relatively low network requirements and high precision, it has become a research hotspot. AOA is the angle of arrival method, a two-base-station method that performs positioning based on the incident angle of the signal. It determines the position as the intersection of two straight lines; since two non-parallel lines intersect at only one point, positioning ambiguity is avoided. However, to measure the incident angle of the electromagnetic wave, the receiver must be equipped with a highly directional antenna array. The RSSI positioning method determines the position from the strength of the received signal.
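The TOA principle described above can be illustrated with a minimal sketch. It assumes a 2-D scene, a radio signal travelling at the speed of light, exactly three anchors, and perfectly synchronized clocks; the function name `toa_trilaterate`, the anchor layout, and the test position are hypothetical and chosen only for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s; assumes a radio signal


def toa_trilaterate(anchors, arrival_times, t_emit=0.0):
    """Estimate a 2-D position from TOA measurements at exactly three anchors.

    anchors: three (x, y) base-station positions in metres (hypothetical layout).
    arrival_times: arrival time at each anchor in seconds, measured on a
        clock synchronized with the emitter.
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    # Convert each time of flight into a range.
    d0, d1, d2 = (SPEED_OF_LIGHT * (t - t_emit) for t in arrival_times)

    # Subtracting the first circle equation from the other two cancels the
    # quadratic terms and leaves a 2x2 linear system A [x, y]^T = b.
    a11, a12 = 2.0 * (x1 - x0), 2.0 * (y1 - y0)
    a21, a22 = 2.0 * (x2 - x0), 2.0 * (y2 - y0)
    b1 = (x1**2 + y1**2) - (x0**2 + y0**2) - d1**2 + d0**2
    b2 = (x2**2 + y2**2) - (x0**2 + y0**2) - d2**2 + d0**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical setup: three anchors and a device at (3, 4) metres.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
times = [math.hypot(true_pos[0] - ax, true_pos[1] - ay) / SPEED_OF_LIGHT
         for ax, ay in anchors]

x, y = toa_trilaterate(anchors, times)

# A 10 ns timing error at one anchor is roughly 3 m of range error.
noisy_times = [times[0] + 10e-9, times[1], times[2]]
xn, yn = toa_trilaterate(anchors, noisy_times)
position_error = math.hypot(xn - true_pos[0], yn - true_pos[1])
```

With exact times the estimate coincides with the true position up to rounding, while the 10 ns perturbation shifts it by metres, which is why TOA demands such strict synchronization. A TDOA solver would instead work with differences of arrival times, so a clock offset common to all measurements drops out.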
In the positioning process, the device measures the signal strength of three different reference points and converts the measurements into three distance values according to a physical propagation model. A geometric solution similar to that of TOA then yields the position. Common RF chips all provide RSSI measurement, so the RSSI mechanism is easy to implement. Its drawback is that it is susceptible to channel conditions and noise and has large measurement errors over long distances, so it is mostly used for small-range positioning [
13]. Visual information is also commonly used for robot localization. These methods can be divided into monocular and stereoscopic visual localization methods [
14–16]. Their purpose is to use the robot vision system to sense the environment, identify the locations of landmarks, obtain local map information, and continuously use the obtained local map. The main problems in visual localization research are that it is difficult to extract features and recognize landmarks quickly and accurately with robot vision systems in complex environments [
17, 18]; the number of image features required for landmark recognition is too large, especially in complex, large-scale environments [
19, 20]. With global maps, there is a problem of information explosion, which increases the complexity and uncertainty of the localization task. Wireless sensors are flexible for robot localization, but it is difficult to achieve high accuracy with them alone. On the other hand, vision sensors are easily affected by many factors, such as the degree of image distortion correction and the tracking error of the feature points [
21–23]. To improve the robustness of robot localization, SIFT [24] and the super-sphere aggregation algorithm [25] are selected as the vision features. The objective of this paper is to develop a method that integrates distinctive features of the indoor environment with a wireless sensor network to improve robot localization.
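To make the RSSI ranging step discussed earlier in this section concrete, the following minimal sketch converts a signal-strength reading into a distance. The text only refers to "a physical model"; the log-distance path-loss model used here is an assumption, as are the function name `rssi_to_distance` and the default parameters (-40 dBm at 1 m, path-loss exponent 2.0), which would have to be calibrated for a real environment.

```python
def rssi_to_distance(rssi_dbm, rssi_at_ref_dbm=-40.0,
                     path_loss_exponent=2.0, ref_distance_m=1.0):
    """Invert the log-distance path-loss model (assumed, not from the paper).

    Model: RSSI(d) = RSSI(d_ref) - 10 * n * log10(d / d_ref)
    All default parameter values are hypothetical calibration constants.
    """
    exponent = (rssi_at_ref_dbm - rssi_dbm) / (10.0 * path_loss_exponent)
    return ref_distance_m * 10.0 ** exponent


# The same 3 dB measurement error costs far more at long range.
near_true = rssi_to_distance(-60.0)   # reading taken at short range
near_noisy = rssi_to_distance(-63.0)  # same reading with a 3 dB error
far_true = rssi_to_distance(-80.0)    # reading taken at long range
far_noisy = rssi_to_distance(-83.0)   # same 3 dB error

err_near = abs(near_noisy - near_true)
err_far = abs(far_noisy - far_true)
```

Because distance grows exponentially with the drop in signal strength, a fixed error in dB becomes a distance error proportional to the range itself, which is consistent with RSSI being used mostly for small-range positioning. The three distances obtained this way could then be passed to a trilateration step such as the one used for TOA.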