1 Introduction
2 Related work
3 Location and radio information application
3.1 UE radio measurements
-
Incomplete neighbor cell list: most of the time, a UE served by a cell might only report the cells included in the serving cell's neighbor cell list. That list might be incomplete, especially in UMTS, where the neighbor list is typically configured manually. Also, the serving cell may need to receive power directly from each neighbor in order to keep it in the list.
-
Inability to obtain the neighbor cell received power from the monitoring application: if the application used to obtain the RSS values is a user-level app (e.g., on Android or iOS), the majority of commercial terminals only report the values for the serving cell [16].
-
Irregular UE reporting period for neighboring cells: even at the control and radio-link monitoring level, the neighboring cells are often measured less frequently or only in association with certain events.
3.2 Location-based measurements processing
-
Location-based RSS_mean, \( {\overline{RSS}}_{\mathrm{AOI}}^{{\mathrm{cell}}_i}\left({\mathrm{cell}}_j,t\right) \), defined as:
$$ {\overline{RSS}}_{\mathrm{AOI}}^{{\mathrm{cell}}_i}\left({\mathrm{cell}}_j,t\right)=\frac{1}{E_{\mathrm{W}}}{\displaystyle \sum_{s=1}^{\left|{M}_j\left[t\right]\right|}}{w}_{\mathrm{AOI}}^{{\mathrm{cell}}_i}\left({\gamma}_{xyz}\left[s\right]\right){m}_{\mathrm{RSS}}\left[s\right], $$ (1)
where \( {M}_j\left[t\right] \) is the set of RSS values \( {m}_{\mathrm{RSS}}\left[s\right] \) measured from the serving cell (or set of cells) cell j during the period t by the UEs located in the AOI of cell i, and \( \left|{M}_j\left[t\right]\right| \) indicates the number of values in the set. \( {w}_{\mathrm{AOI}}^{{\mathrm{cell}}_i}\left({\gamma}_{xyz}\left[s\right]\right) \) represents the individual weight applied to one measurement \( {m}_{\mathrm{RSS}}\left[s\right] \), depending on the coordinates \( {\gamma}_{xyz}\left[s\right] \) where the sample was measured. Finally, \( {E}_{\mathrm{W}} \) is the sum of all the applied weights, \( {E}_{\mathrm{W}}={\displaystyle {\sum}_{s=1}^{\left|{M}_j\left[t\right]\right|}{w}_{\mathrm{AOI}}^{{\mathrm{cell}}_i}\left({\gamma}_{xyz}\left[s\right]\right)} \).
The serving cell j can be equal to cell i or a different cell of the scenario. When it is different, the generated indicator includes information on the UEs served by cell j but located in an AOI of cell i, as presented in Fig. 1. This is one of the main characteristics of the proposed approach, as it allows a possibly faulty cell i to be monitored by its neighbors. In this way, the sleeping cell is detected from the measurements reported by the UEs served by its neighboring cells.
-
Location-based RSS_5th percentile, calculated as the RSS value below which the lowest 5 % of the collected RSS values lie. In non-location approaches, this is a common indicator of the values gathered at the cell edge and/or far from the cell, as well as in poorly covered (shadowed) spots. If a cell is in outage, the classic RSS_5th percentile of its neighboring cells would especially reflect the RSS received by the most poorly served UEs in the area originally covered by the faulty cell.
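The two location-based indicators above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample layout `(rss_dbm, position)`, and the `weight_fn` callable are hypothetical. The mean follows the structure of Eq. (1); for the percentile, the sketch assumes a nearest-rank percentile over the samples with non-zero AOI weight, which the text does not specify.

```python
import math


def location_based_rss_mean(samples, weight_fn):
    """Weighted mean of RSS samples, following the structure of Eq. (1).

    samples: iterable of (rss_dbm, position) tuples reported by UEs served by cell j.
    weight_fn: hypothetical callable mapping a position to the AOI weight of cell i.
    Returns None when no sample carries any weight (no UE inside the AOI).
    """
    weighted_sum = 0.0
    total_weight = 0.0  # plays the role of E_W in Eq. (1)
    for rss, position in samples:
        w = weight_fn(position)
        weighted_sum += w * rss
        total_weight += w
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


def location_based_rss_percentile(samples, weight_fn, fraction=0.05):
    """Nearest-rank percentile over samples with non-zero AOI weight (assumption)."""
    inside = sorted(rss for rss, position in samples if weight_fn(position) > 0)
    if not inside:
        return None
    rank = max(1, math.ceil(fraction * len(inside)))
    return inside[rank - 1]


# Hypothetical binary AOI: positions with x <= 5 m belong to the AOI of cell i.
def example_weight(position):
    return 1.0 if position[0] <= 5 else 0.0


example_samples = [(-80.0, (0, 0, 0)), (-90.0, (1, 0, 0)), (-100.0, (10, 0, 0))]
print(location_based_rss_mean(example_samples, example_weight))
print(location_based_rss_percentile(example_samples, example_weight))
```

Returning `None` for an empty AOI keeps the "no samples available" case distinct from a genuinely low RSS value.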
3.3 Sample weights and AOIs for the detection of sleeping cells
-
Expected coverage area (ECov) of the possible sleeping cell. The UEs in this area are the most likely to be impacted by the cell issue. However, depending on the degree of overlap between cells, this impact might be compensated by the coverage provided by neighboring BSs.
-
Expected center area (ECent) of the cell refers to locations in the core of the coverage area of a cell. In these locations, the signal of the cell is clearly predominant with respect to that of its neighbors.
3.4 AOIs calculation considerations
AOI calculation approach | Required network information | Required localization/scenario information |
---|---|---|
Classic (no location/AOIs) | UE measurements | – |
Test campaign | UE measurements, radiomap database | UE positions, positions during radiomap gathering |
Site-specific detailed propagation | UE measurements | UE positions, BSs positions, transmitted power, obstacles, walls, propagation conditions |
Simple log-distance path loss | UE measurements | UE positions, BSs positions, transmitted power |
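
As an illustration of the last row of the table, the following sketch derives binary ECov and ECent weights from a simple log-distance path loss model, using only UE positions, BS positions, and transmitted powers. The reference loss `pl0_db`, the path loss exponent, the 10 dB center margin, and the binary weighting are all assumptions made for the example; the function names are hypothetical.

```python
import math


def received_power_dbm(tx_power_dbm, bs_xy, ue_xy, pl0_db=40.0, exponent=3.0, d0_m=1.0):
    """Received power under a log-distance path loss model (parameters are assumptions)."""
    distance = max(math.dist(bs_xy, ue_xy), d0_m)  # clamp to the reference distance
    path_loss_db = pl0_db + 10.0 * exponent * math.log10(distance / d0_m)
    return tx_power_dbm - path_loss_db


def aoi_weights(position, cell_i, base_stations, center_margin_db=10.0):
    """Return (w_ecov, w_ecent) binary weights of `position` with respect to cell_i.

    base_stations: dict mapping cell id -> (tx_power_dbm, (x, y)).
    ECov: cell_i is expected to be the strongest cell at this position.
    ECent: cell_i exceeds every other cell by at least center_margin_db.
    """
    own = received_power_dbm(*base_stations[cell_i], position)
    others = [
        received_power_dbm(tx, xy, position)
        for cell_id, (tx, xy) in base_stations.items()
        if cell_id != cell_i
    ]
    best_other = max(others, default=-math.inf)
    w_ecov = 1.0 if own >= best_other else 0.0
    w_ecent = 1.0 if own >= best_other + center_margin_db else 0.0
    return w_ecov, w_ecent


# Two hypothetical small cells, 100 m apart, with equal transmitted power.
example_bss = {"cell_i": (3.0, (0.0, 0.0)), "cell_k": (3.0, (100.0, 0.0))}
for ue_xy in [(10.0, 0.0), (45.0, 0.0), (90.0, 0.0)]:
    print(ue_xy, aoi_weights(ue_xy, "cell_i", example_bss))
```

A site-specific propagation model or a radiomap could replace `received_power_dbm` without changing how the weights are derived.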
4 Detection algorithm
4.1 Training phase
-
ON/OFF calibration period: by a simple procedure of alternately disconnecting and reconnecting the cells (which can be performed automatically), the system can obtain the needed sleeping cell training set. This may, however, alter the normal operation of the network.
-
Neighboring cells measurements analysis: if the terminals are able to measure and report RSS values from neighboring cells, this information can be used to approximate their expected serving cell values if one cell is disconnected. This process would have the advantage of not disrupting the cell service provision, and it could be performed continuously.
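The neighboring cell measurements analysis can be sketched as follows. The example assumes, as a simplification not stated in the text, that a UE whose cell is disconnected would be served by the strongest remaining cell among those it reports; the function name and the report layout are hypothetical.

```python
def expected_rss_if_cell_off(reports, cell_off):
    """Emulate the serving-cell RSS values that would be seen if `cell_off` were sleeping.

    reports: list of dicts mapping cell id -> RSS (dBm) measured by one UE,
             including its serving cell and any reported neighbors.
    Returns a list of (expected_serving_cell, expected_rss_dbm), one per UE that
    still measures at least one other cell.
    Assumption: the UE attaches to the strongest remaining cell.
    """
    emulated = []
    for measured in reports:
        remaining = {cell: rss for cell, rss in measured.items() if cell != cell_off}
        if not remaining:
            continue  # this UE reports no other cell, so nothing can be inferred
        new_serving = max(remaining, key=remaining.get)
        emulated.append((new_serving, remaining[new_serving]))
    return emulated


example_reports = [
    {"c1": -70.0, "c2": -85.0, "c3": -95.0},
    {"c2": -60.0, "c1": -90.0},
    {"c1": -75.0},
]
print(expected_rss_if_cell_off(example_reports, "c1"))
```

The emulated values can then feed the same indicator calculation used for real measurements in order to build the sleeping cell training set without disconnecting any cell.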
4.2 Online phase
4.3 Confidence level definition
5 Diagnosis of sleeping cell causes
No. | Failure | Description | Indicator: UE RSS values | Indicator: small cell NETACC | Indicator: backhaul NETACC (general access to the backhaul) |
---|---|---|---|---|---|
1 | Small cell booting process | Small cell starting logon procedure into the operator’s network fails. The small cell stays active but without transmitting. | Affected | Unaffected | Unaffected |
2 | Small cell disconnection | Small cell is disconnected from power supply and/or backhaul connection. It stops transmitting. | Affected | Affected | Unaffected |
3 | Backhaul NETACC | Failure of the backhaul connection of the network. | Affected | Affected | Affected |
4 | Checking entity NETACC | The entity checking the cells is not able to connect with them. | Unaffected | Affected | Unaffected |
5 | SON system NETACC | The SON system is unable to connect specifically with the router due to wrong IP config. | Unaffected | Affected | Affected |
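
The table above maps each combination of affected indicators to a single failure, so the diagnosis can be expressed as a lookup. The sketch below encodes the five rows directly; the function name is hypothetical, and returning `None` for combinations that do not appear in the table is an assumption.

```python
# Key: (UE RSS values affected, small cell NETACC affected, backhaul NETACC affected)
DIAGNOSIS_TABLE = {
    (True, False, False): "Small cell booting process",
    (True, True, False): "Small cell disconnection",
    (True, True, True): "Backhaul NETACC",
    (False, True, False): "Checking entity NETACC",
    (False, True, True): "SON system NETACC",
}


def diagnose(rss_affected, small_cell_netacc_affected, backhaul_netacc_affected):
    """Return the failure matching the indicator pattern, or None if it is not in the table."""
    key = (
        bool(rss_affected),
        bool(small_cell_netacc_affected),
        bool(backhaul_netacc_affected),
    )
    return DIAGNOSIS_TABLE.get(key)


print(diagnose(True, True, False))
print(diagnose(False, False, False))
```

Because every row has a distinct indicator pattern, the lookup is unambiguous for the five listed failures.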
6 Distributed Self-healing scheme
6.1 Distributed scheme
-
In order to define the cells likely to be affected by a failure in a particular cell i, the approach is to automatically include in \( {\mathbf{cells}}_{\boldsymbol{i}}^{\mathbf{imp}} \) the cell i itself and its adjacent neighbors. The adjacent cells can be defined from the estimated coverage area maps, selecting the BSs whose coverage areas are in contact. The \( {\mathbf{cells}}_{\boldsymbol{i}}^{\mathbf{imp}} \) set can also be updated based on the neighbor cell list of each cell, as these lists are automatically updated during network operation [31]. All the cells in the deployment should know the different \( {\mathbf{cells}}_{\boldsymbol{i}}^{\mathbf{imp}} \) sets and their relative positions in order to participate in the detection of problems in the sets of which they are part. Based on \( {\mathbf{cells}}_{\boldsymbol{i}}^{\mathbf{imp}} \) and their relative positions, the AOIs of cell i can be calculated (in a centralized or distributed way) and stored.
-
As described in Section 4, cell j participating in the detection of a failure in cell i needs to be in possession of the prior likelihood of each status and the conditional PDFs: \( {\mathrm{pdf}}_{ij}^{\mathrm{AOI},\;{\mathrm{Normal}}_i}\left({r}_{ij}\right) \) and \( {\mathrm{pdf}}_{ij}^{\mathrm{AOI},\kern0.5em {\mathrm{Sleeping}}_i}\left({r}_{ij}\right) \) (where the previously presented nomenclature has been particularized for the indicator \( {F}_{ij}^{\mathrm{AOI}} \)). During the training phase, the PDFs can be constructed and stored directly by cell j, as \( {F}_{ij}^{\mathrm{AOI}} \) is locally generated by the BS. From these, the \( \Psi \left({\mathrm{F}}_{ij}^{\mathrm{AOI}}\right) \) parameter can also be calculated and stored. The prior likelihoods are assigned a default or configured value.
-
During the operational life of the network, the process is divided into different stages of computation and information sharing between the cells.
-
This stage is described in Fig. 2. Firstly, cell j gathers the RSS samples reported by its served UEs. Secondly, the location associated with each measurement is obtained directly from localization sources, which can be the UEs themselves, a cellular-based positioning system, or an external localization service [9]. Thirdly, with this information and the stored AOIs, the values of \( {w}_{\mathrm{AOI}}^{{\mathrm{cell}}_i} \) are calculated for all the RSS samples and then the location-based indicator value \( {F}_{ij}^{\mathrm{AOI}}={f}_{ij}^{\mathrm{AOI}}\left[t\right] \) is obtained. Fourthly, based on this and the conditional PDFs, the likelihoods of the current value given the status of cell i, \( \widehat{p}\left({F}_{ij}^{\mathrm{AOI}}={f}_{ij}^{\mathrm{AOI}}\left[t\right]\ \left|\mathrm{Normal}\right.\right) \) and \( \widehat{p}\left({F}_{ij}^{\mathrm{AOI}}={f}_{ij}^{\mathrm{AOI}}\left[t\right]\ \left|\mathrm{Sleeping}\right.\right) \), are calculated, as well as \( {\varphi}_{ij}^{\mathrm{AOI}}\left[t\right] \) (see Eq. (13)). Also, the number of samples inside the AOI that were used to generate the indicator is propagated to the following distribution stage.
-
Afterwards, each cell of \( {\mathrm{cells}}_i^{\mathrm{imp}} \) shares its estimated conditional probabilities with the rest of the cells of the set. This and the next stages are presented in Fig. 3. The message from a cell might not be received due to incorrect timing, connection losses, or a failure in the cell, none of which can be considered an unequivocal consequence of a sleeping cell failure (as described in Section 5). Therefore, the conditional probabilities for the indicator of that cell are assumed to be “1” for both statuses, which means that such input is not considered in the classifier.
-
Having the conditional probabilities from the other cells of \( {\mathrm{cells}}_i^{\mathrm{imp}} \), the status of cell i can be calculated by any of them based on the naive Bayes classifier detection rule presented in Eq. (15), providing its estimated status.
-
If a neighboring cell is detected as sleeping, any other BS can check its NETACC in order to determine the particular cause behind the problem. This check, together with the estimated status, is used to identify the particular cause by means of binary logic following Table 2.
-
If the information has been properly received and computed by each BS, the estimated status shall be the same for all of them. However, this may not be the case if any distributed message is lost, or if the NETACC check provides different results for different BSs (e.g., due to congestion). Therefore, consensus techniques can be applied to the results achieved by all of the BSs. Multiple mechanisms have been developed in the general field of distributed computation to reach a common diagnosis from the possibly different results of each node. For example, selecting a master/coordinator cell dedicated to performing the final posterior probability calculations and sharing them with the other cells maintains consensus and reduces the computational cost by freeing some of the BSs from performing the classification [31]. However, the system then becomes more vulnerable to failures in the master cell. To avoid this, the use of strong consistency as presented in [34] is recommended. This is based on making the independent diagnoses performed by each cell consistent by sharing and checking their mutual results.
-
Once the cell status is detected and diagnosed, compensation and recovery mechanisms can be triggered, for instance, readjusting cell powers to compensate for a neighboring sleeping cell, automatically rebooting a cell found to be faulty, or alerting the operator’s OAM system about the issue.
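The classification stage described above can be sketched with a generic naive Bayes combination. This is only an illustration: it does not reproduce Eq. (15) or the \( \varphi \) weighting of the paper, the function name and the 0.5 decision threshold are assumptions, and a lost message is modeled as the pair (1, 1), as described in the distribution stage.

```python
def classify_cell_status(prior_sleeping, likelihoods):
    """Combine per-cell likelihoods with a generic naive Bayes rule.

    prior_sleeping: prior probability that cell i is sleeping.
    likelihoods: one entry per cell of the set, either a tuple
                 (p_given_normal, p_given_sleeping) or None for a lost message.
    Returns (status, posterior_sleeping), or (None, None) if both scores are zero.
    """
    score_normal = 1.0 - prior_sleeping
    score_sleeping = prior_sleeping
    for entry in likelihoods:
        # A missing message contributes 1 to both products, so it has no influence.
        p_normal, p_sleeping = entry if entry is not None else (1.0, 1.0)
        score_normal *= p_normal
        score_sleeping *= p_sleeping
    total = score_normal + score_sleeping
    if total == 0:
        return None, None
    posterior_sleeping = score_sleeping / total
    status = "Sleeping" if posterior_sleeping > 0.5 else "Normal"  # threshold is an assumption
    return status, posterior_sleeping


# Three neighbors, the second of which did not deliver its message.
print(classify_cell_status(0.5, [(0.2, 0.8), None, (0.1, 0.4)]))
```

Since every BS can run the same function on the values it received, differences between BSs arise only from lost messages, which is where the consensus step applies.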
6.2 Implementation
7 Evaluation
Category | Parameter | Value |
---|---|---|
Propagation model | Indoor-indoor | Winner II A1 |
Propagation model | Indoor-outdoor | Winner II A2 |
Propagation model | Outdoor-outdoor | Winner II C2 |
Propagation model | Outdoor-indoor | Winner II C4 |
Base station model | EIRP | 3 (small cells)/43 (macro) dBm |
Base station model | Directivity | Omni (small)/tri-sector (macro) |
Base station model | Access | Open (small)/open (macro) |
Mobile station model | Noise figure | 9 dB |
Mobile station model | Noise density | −174 dBm/Hz |
Traffic model | Calls | Poisson (avg. 0.43 calls/user·h) |
Traffic model | Duration | Exponential (avg. 100 s) |
Mobility model | Outdoor | 3 km/h, random direction, and wrap-around |
Mobility model | Indoor | Random waypoint |
Service model | Voice over IP | 16 kbps |
Service model | Full buffer | – |
RRM model | Bandwidth | 1.4 MHz (6 PRBs) |
RRM model | Access control | Directed retry (threshold = −44 dBm) |
RRM model | Cell reselection | Criteria S, R |
RRM model | Handover | Events A3, A5 |
RRM model | Scheduler | Voice: round-robin best channel; full buffer: proportional fair |
Time resolution | – | 100 ms |
Load balancing algorithm | Epoch time | 60 s |
7.1 Impact of sleeping cell case in classic performance indicators
7.2 RSS indicators and AOIs
7.2.1 Detection performance
Indicator type \ figure of merit | Cell 9 FN | Cell 9 FA | Cell 9 IN | Cell 10 FN | Cell 10 FA | Cell 10 IN | Cell 11 FN | Cell 11 FA | Cell 11 IN | Cell 12 FN | Cell 12 FA | Cell 12 IN |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Non-location local | 69.0 | 25.6 | 0.0 | 13.5 | 35.2 | 0.0 | 17.5 | 44.7 | 0.0 | 39.5 | 19.1 | 0.0 |
Non-location centralised | 46.0 | 45.7 | 0.0 | 19.0 | 29.1 | 0.0 | 33.5 | 28.1 | 0.0 | 48.5 | 9.5 | 0.0 |
Non-location distributed | 22.0 | 0.0 | 0.0 | 14.0 | 0.0 | 0.0 | 21.0 | 0.0 | 0.0 | 35.0 | 0.0 | 0.0 |
Centralised ECov | 13.0 | 15.1 | 10.5 | 3.5 | 0.5 | 2.8 | 31.5 | 37.7 | 0.0 | 35.0 | 21.1 | 0.0 |
Centralised ECent | 7.0 | 0.0 | 40.1 | 2.5 | 0.5 | 17.8 | 21.5 | 0.5 | 3.3 | 63.5 | 0.0 | 4.3 |
Distributed ECov | 3.5 | 0.5 | 10.8 | 0.0 | 0.0 | 2.8 | 21.5 | 0.0 | 0.0 | 57.0 | 0.0 | 0.8 |
Distributed ECent | 0.0 | 0.0 | 40.6 | 0.0 | 0.0 | 18.0 | 1.0 | 0.0 | 3.5 | 53.5 | 0.0 | 5.0 |