Open Access 2022 | Original Paper | Book Chapter

# Efficient Location-Based Tracking for IoT Devices Using Compressive Sensing and Machine Learning Techniques

Authors: Ramy Aboushelbaya, Taimir Aguacil, Qiuting Huang, Peter A. Norreys

Published in: High-Dimensional Optimization and Probability

## Abstract

## 1 Introduction

## 2 Signal Representation

### 2.1 Principles

The device's position is sampled at N discrete points indexed by n. Its path can thus be represented as a 2 × N vector of pairs (λ[n], ϕ[n]), with each pair representing the device's location at a specific sampling point. This vector will hereafter be referred to as the fully sampled path.
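As a minimal sketch of this data structure (the coordinate values and the value of N are illustrative, not taken from the chapter), the fully sampled path can be stored as a 2 × N array:

```python
import numpy as np

# Hypothetical fully sampled path: row 0 holds latitudes lambda[n],
# row 1 holds longitudes phi[n], one column per sampling point n.
N = 8                                  # number of sampling points (toy value)
lam = np.linspace(47.36, 47.40, N)     # example latitudes
phi = np.linspace(8.52, 8.56, N)       # example longitudes
path = np.vstack([lam, phi])           # shape (2, N): the fully sampled path

print(path.shape)  # → (2, 8)
```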

### 2.2 Model

The measured positions are corrupted by additive noise with standard deviation σ_e. This noise level may be constant for each data point on the path, or it may vary between points in order to simulate the effect of transient noise sources.
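A sketch of this noise model, assuming additive Gaussian noise with a per-point standard deviation σ_e (the array names, the noise range, and the use of a zero path as placeholder are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Each point of the 2 x N path is perturbed with Gaussian noise whose
# standard deviation sigma_e may differ per point, mimicking transient
# noise sources along the path.
N = 100
path = np.zeros((2, N))                    # placeholder fully sampled path
sigma_e = rng.uniform(0.0, 50.0, size=N)   # per-point noise level
noisy_path = path + rng.normal(0.0, 1.0, size=(2, N)) * sigma_e

print(noisy_path.shape)  # → (2, 100)
```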

## 3 Compressive Sensing

### 3.1 Basics

Compressive sensing assumes that the signal of interest is sparse in some basis {ψ_i}_{i ∈ [1,n]}. This means that our signal can be written as

$$\mathbf{x} = \sum_{i=1}^{n} s_i \boldsymbol{\psi}_i,$$

where the s_i are the sparse components of the vector in the new basis. The problem can then be modeled as an ℓ₀-minimization subject to a constraint. The ℓ₀-norm is commonly defined as the number of non-zero components of a vector, which means that it quantifies the true sparsity of the vector. Using this paradigm, we can treat path reconstruction as a multidimensional constrained optimization problem.
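The constrained program itself is not reproduced in this excerpt; as a sketch in standard compressive-sensing notation (the measurement vector **y** and the matrices Φ and Ψ are assumed symbols here, not taken from the chapter), the ℓ₀ problem described above can be written as:

```latex
\min_{\mathbf{s}} \|\mathbf{s}\|_{0}
\quad \text{subject to} \quad
\mathbf{y} = \Phi \Psi \mathbf{s}
```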

### 3.2 Sparsity

The ℓ₀-norm, although it models the concept of sparsity perfectly, is poorly adapted to real-world use for several reasons. For example, realistic signals are rarely exactly sparse, in the sense that they rarely have mostly zero-valued components in some basis. In most applications of CS, one instead considers pseudo-sparsity, where in some basis most of the components are much smaller than the few major components of the signal. This way, the minor components can be safely ignored while ensuring that most of the important information in the signal is captured, so that it can still be accurately reconstructed. Furthermore, the ℓ₀-norm is a pseudo-norm and is non-differentiable. Last but not least, ℓ₀ solutions are very sensitive to noise, and real-world signals are always noisy.

A common remedy is therefore to relax the problem and minimize the ℓ₁-norm instead. Hence, Eq. (4) can be rewritten as:
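Eq. (4) and its ℓ₁ rewriting are not reproduced in this excerpt; a standard form of the relaxed program (basis pursuit denoising, with an assumed noise tolerance ϵ — this notation is not taken from the chapter) reads:

```latex
\min_{\mathbf{s}} \|\mathbf{s}\|_{1}
\quad \text{subject to} \quad
\|\mathbf{y} - \Phi \Psi \mathbf{s}\|_{2} \le \epsilon
```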

### 3.3 Discrete Cosine Transform

### 3.4 Measurement Matrix

The Bernoulli subsampling scheme is agnostic to the noise level σ_e, i.e., it treats all data points as being equal. This is, of course, not the case in a real-world application, where σ_e varies along the path. Moreover, because the sampling is based on random number generation, the effective sampling ratio is almost never exactly equal to the predetermined ν. This is detrimental to applications that require a fixed, known sampling ratio. These shortcomings will be addressed in the following sections using so-called adaptive sampling schemes. The complete process for generating the subsampling matrix is illustrated in Fig. 3.
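The Bernoulli subsampling can be sketched as follows; note how the effective ratio generally differs from the target ν, as stated above (variable names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Bernoulli subsampling with target ratio nu: each of the N samples is
# kept independently with probability nu. The subsampling matrix Phi is
# the identity with the non-kept rows removed.
N, nu = 256, 0.2
keep = rng.random(N) < nu          # Bernoulli mask
Phi = np.eye(N)[keep]              # selected rows of the identity matrix
effective_ratio = keep.sum() / N   # generally != nu

print(Phi.shape, effective_ratio)
```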

### 3.5 Block-Wise Reconstruction

## 4 Lasso

### 4.1 Algorithm

The LASSO seeks, among the candidate solutions that fit the measurements in the ℓ₂-norm sense, the vector with the smallest ℓ₁-norm. As previously discussed, the ℓ₁-norm ensures that the regression is biased toward sparser vectors, with the degree of bias controlled by μ. Expressing the problem as Eq. (9) turns it into a linear regression problem and allows the use of traditional LASSO regression solvers. For this case, we have used the conventional coordinate descent algorithm, which is known to be quite effective for the LASSO problem and easily implementable across various platforms [25]. The mathematical details of this algorithm are beyond the scope of this chapter but can be found in the literature [24].
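A minimal sketch of cyclic coordinate descent for a LASSO objective of the form 0.5‖y − As‖₂² + μ‖s‖₁; this is not the authors' implementation, and the toy problem dimensions are illustrative:

```python
import numpy as np

def soft_threshold(x, t):
    """Soft-thresholding operator, the closed-form scalar LASSO update."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(A, y, mu, n_iter=200):
    """Minimise 0.5*||y - A s||_2^2 + mu*||s||_1 by cyclic coordinate descent."""
    n = A.shape[1]
    s = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)      # per-coordinate curvature ||a_j||^2
    r = y - A @ s                      # current residual
    for _ in range(n_iter):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            r += A[:, j] * s[j]        # residual with coordinate j removed
            rho = A[:, j] @ r
            s[j] = soft_threshold(rho, mu) / col_sq[j]
            r -= A[:, j] * s[j]        # add updated coordinate back
    return s

# Toy usage: recover a 2-sparse vector from noisy random measurements.
rng = np.random.default_rng(2)
A = rng.normal(size=(30, 10))
s_true = np.zeros(10)
s_true[[2, 7]] = [3.0, -2.0]
y = A @ s_true + 0.01 * rng.normal(size=30)
s_hat = lasso_cd(A, y, mu=0.1)
print(np.round(s_hat, 1))
```

The soft-thresholding step is what drives small coordinates exactly to zero, which is how the ℓ₁ penalty produces sparse solutions in practice.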

### 4.2 Parameter Optimization

where (x_i, y_i) are the Web Mercator coordinates and all variables with a circumflex denote reconstructed quantities.

When it comes to finding the optimal block length, there is a trade-off between complexity and performance. It should be noted that, in evaluating the performance of the algorithm, we consider the deviation over the full reconstructed path and not just over the individual blocks. Intuitively, the higher the sampling ratio, the better the performance, since more samples are included and the CS scheme reduces to a denoising operation. However, this is not what is desired: the main application is sub-sampled reconstruction, so the goal is to find the minimum sampling ratio that still allows a reasonable reconstruction. As for the block length, the shorter it is, the faster each block-wise iteration runs; additionally, less storage is required and the inverse transformation is less complex. However, the block length also affects the ℓ₁ pseudo-sparsity and thus the CS reconstruction.

Performance was evaluated over block length B and sampling ratio ν for a fixed noise level σ_e and regularization parameter μ. One can see that the best achievable performance at the lowest sampling ratio is obtained for B ∼ O(10²) with ν ∼ 0.2. Since the DCT is most efficient when the block length is a power of 2, we choose B = 256.
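To illustrate DCT-domain sparsity at the chosen block length B = 256, the following sketch builds an orthonormal DCT-II matrix directly; the toy block, made of exactly two DCT modes, stands in for the pseudo-sparse path blocks discussed in the text:

```python
import numpy as np

def dct_matrix(B):
    """Orthonormal DCT-II matrix of size B x B."""
    k = np.arange(B)[:, None]
    n = np.arange(B)[None, :]
    C = np.sqrt(2.0 / B) * np.cos(np.pi * (2 * n + 1) * k / (2 * B))
    C[0, :] /= np.sqrt(2.0)   # rescale DC row so rows are orthonormal
    return C

B = 256                        # block length chosen in the text
n = np.arange(B)
# Toy block built from two DCT modes (k = 3 and k = 7): exactly sparse
# in the DCT basis, so only two coefficients are significant.
block = (np.cos(np.pi * (2 * n + 1) * 3 / (2 * B))
         + 0.5 * np.cos(np.pi * (2 * n + 1) * 7 / (2 * B)))
coeffs = dct_matrix(B) @ block
big = int(np.sum(np.abs(coeffs) > 1e-6 * np.abs(coeffs).max()))
print(big)  # → 2
```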

### 4.3 Adaptive Sampling

A relative error is defined as 𝜖_r = min(σ_e ∕ σ_max, 1), where σ_max is a predefined error hyperparameter that can vary depending on the application. 𝜖_r is then compared to an error threshold β in order to decide on the incoming signal quality.
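The quality rule can be sketched as follows; whether a small 𝜖_r counts as "good" quality is an assumption here, as are the default parameter values:

```python
def signal_quality(sigma_e, sigma_max=50.0, beta=0.5):
    """Relative error eps_r = min(sigma_e / sigma_max, 1), compared to the
    threshold beta. Here a small eps_r is assumed to mean good quality;
    the default sigma_max and beta are illustrative."""
    eps_r = min(sigma_e / sigma_max, 1.0)
    return eps_r, eps_r <= beta   # (relative error, quality flag)

print(signal_quality(10.0))   # → (0.2, True)
print(signal_quality(80.0))   # → (1.0, False)
```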

## 5 Neural Networks

### 5.1 Introduction

### 5.2 Network Design and Optimization

### 5.3 Adaptive Sampling

## 6 Numerical Results

### 6.1 Simulation Results

To simulate a varying noise level σ_e, each block is split randomly into random-sized "chunks," where each "chunk" has a different noise level that can range from very high to very low. Although the noise-adaptive sampling has a set target sampling ratio ν (as can be seen in Algorithm 1), the dependence on the noise level means that the effective sampling ratio can vary significantly. Therefore, in order to accurately gauge the performance gain attained by the noise-adaptive sampling, we compared it to regular Bernoulli sampling at multiple different sampling ratios. A simulation with adaptive sampling parameters δ = 0.1, β = 0.5, and σ_max = 50 m, and with noise chunks ranging from 0 m to 100 m, was run. It showed an improvement in reconstruction performance from the adaptive sampling of about 50% compared to purely random sampling with the same target sampling ratio (ν = 0.2), and of about 25% compared to random sampling with the same effective sampling ratio (ν ∼ 0.4).
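The chunk-based noise simulation can be sketched as follows; the chunk-count range, random seed, and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# A block of B samples is split into random-sized "chunks", each with its
# own noise level drawn from [0, 100] m, mimicking transient noise.
B = 256
n_chunks = rng.integers(3, 8)                  # random number of chunks
cuts = np.sort(rng.choice(np.arange(1, B), size=n_chunks - 1, replace=False))
bounds = np.concatenate([[0], cuts, [B]])      # chunk boundaries
sigma_e = np.empty(B)
for lo, hi in zip(bounds[:-1], bounds[1:]):
    sigma_e[lo:hi] = rng.uniform(0.0, 100.0)   # one noise level per chunk

print(len(np.unique(sigma_e)))  # number of distinct noise levels = chunks
```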