1 Introduction
1.1 Problem description/objective
1.2 Research contribution
- Since there is no definitive guideline for the minimum dataset size in deep learning classification problems, this research aims to determine a baseline for aircraft prediction models.
- This paper determines the baseline features needed to identify an aircraft from kinematic data: speed, barometric pressure, and vertical speed.
- This paper analyzes and reiterates the importance of selecting appropriate features. The ‘noise’ feature within this dataset severely limited the classification power of the network.
1.3 Organization
2 Background and literature review
Paper | ADS-B | Prediction & classification | Improve aircraft operations | Transmission security | Dataset size | Collection & processing |
---|---|---|---|---|---|---|
Ginoulhac et al. (2019) [7] | \(\checkmark\) | \(\checkmark\) | ||||
Qian et al. (2019) [8] | \(\checkmark\) | \(\checkmark\) | ||||
Kumar et al. (2021) [11] | \(\checkmark\) | \(\checkmark\) | ||||
Basrawi (2021) [19] | \(\checkmark\) | \(\checkmark\) | ||||
Basrawi et al. (2021) [23] | \(\checkmark\) | \(\checkmark\) | ||||
Sun et al. (2016) [24] | \(\checkmark\) | \(\checkmark\) | ||||
Sun et al. (2017) [25] | \(\checkmark\) | \(\checkmark\) | ||||
Ruseno et al. (2022) [26] | \(\checkmark\) | \(\checkmark\) | ||||
Filippone et al. (2021) [27] | \(\checkmark\) | \(\checkmark\) | ||||
Hasin et al. (2021) [28] | \(\checkmark\) | \(\checkmark\) | ||||
Pearce et al. (2021) [29] | \(\checkmark\) | \(\checkmark\) | ||||
Sun (2021) [30] | \(\checkmark\) | \(\checkmark\) | ||||
Karim et al. (2017) [31] | \(\checkmark\) | |||||
Karim et al. (2019) [32] | \(\checkmark\) | |||||
Goodfellow et al. (2016) [33] | \(\checkmark\) | |||||
Hu et al. (2018) [34] | \(\checkmark\) | |||||
Alwosheel et al. (2018) [35] | \(\checkmark\) | |||||
Cho et al. (2015) [36] | \(\checkmark\) | |||||
Jain et al. (1982) [37] | \(\checkmark\) | |||||
Baum et al. (1988) [38] | \(\checkmark\) | |||||
Haykin (2009) [39] | \(\checkmark\) |
2.1 Automatic dependent surveillance-broadcast (ADS-B)
2.2 Multivariate long short-term memory–fully convolutional neural networks
2.3 Dual-stage deep engine classifier
Classifier | Overall accuracy | Jet accuracy | Turboprop accuracy | Piston engine accuracy |
---|---|---|---|---|
DSDEC (300 time steps) | 89.2% | 98.4% | 79.2% | 89.9% |
SVM (300 time steps) | 77.4% | 85.5% | 72.2% | 74.6% |
SVM (100 time steps) | 68.6% | 81.1% | 56.6% | 68.1% |
RF (300 time steps) | 83.4% | 94.1% | 70.5% | 85.8% |
RF (100 time steps) | 76.7% | 90.0% | 61.5% | 78.6% |
2.4 Size of dataset
2.5 Summary
3 Methodology
- Experiment 1: The first experiment creates 24 models from two learning rates, three feature sets, and four data amounts. The models train for 200 epochs and are evaluated on the testing data. The goal is to determine the effect of varying the number of training observations. Effects on accuracy, loss, precision, and recall are recorded.
- Experiment 2: The second experiment develops models using all possible feature combinations (255). The goal is to determine which subset of features yields the highest overall accuracy when predicting engine type. The models train for 50 epochs, the point at which training and validation accuracy diverge in the first experiment.
Exp # | Goals | # of Models | Parameters |
---|---|---|---|
1 | Determine the effect of varying the amount of training data samples | 24 | Learning rate: 0.01 & 0.001; Dropout: 0.5; Time steps: 300; Feature set: limited, medium & full; Optimizer: Adam; Data size: Full, 1/2, 1/4, & 1/8; Epochs: 200 |
2 | Determine the effect of each subset of features | 255 | Learning rate: 0.001; Dropout: 0.5; Time steps: 300; Feature set: all possible subsets; Optimizer: Adam; Data size: Full; Epochs: 50 |
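The model counts in the table follow from the parameter grid. The sketch below reproduces them under one assumption: that Experiment 2 draws on eight input features, with X, Y, Z grouped as a single `location` feature. The variable names are illustrative and do not come from the original study.

```python
from itertools import combinations, product

# Experiment 1: every combination of learning rate, feature set, and data size.
learning_rates = [0.01, 0.001]
feature_sets = ["limited", "medium", "full"]
data_sizes = ["full", "1/2", "1/4", "1/8"]
experiment_1 = list(product(learning_rates, feature_sets, data_sizes))

# Experiment 2: every non-empty subset of the input features.
# Assumption: eight features, with X, Y, Z treated as one "location" feature.
features = ["Alt", "Galt", "Spd", "InHg", "Vsi", "PosTime", "Trak", "location"]
experiment_2 = [
    subset
    for size in range(1, len(features) + 1)
    for subset in combinations(features, size)
]

print(len(experiment_1))  # 2 * 3 * 4 = 24
print(len(experiment_2))  # 2**8 - 1 = 255
```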
3.1 Assumptions and limitations
3.2 Process
3.2.1 Data preparation
Size | Samples | Tensors |
---|---|---|
Full | 1,233,000 | 4,110 |
Half | 616,500 | 2,055 |
Quarter | 307,800 | 1,026 |
Eighth | 153,900 | 513 |
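The sample and tensor counts are linked by the 300 time steps listed in the experiment parameters: each tensor holds 300 observations. The following is a minimal sketch of that relationship, assuming non-overlapping windows; the helper `to_windows` is hypothetical and is not the authors' preprocessing code.

```python
TIME_STEPS = 300  # window length from the experiment parameters

def to_windows(samples, time_steps=TIME_STEPS):
    """Split a flat list of observations into non-overlapping windows.

    Trailing observations that do not fill a complete window are dropped.
    """
    n_windows = len(samples) // time_steps
    return [
        samples[i * time_steps:(i + 1) * time_steps]
        for i in range(n_windows)
    ]

# Tensor counts from the table; multiplying by the window length
# gives the number of individual samples.
tensor_counts = {"Full": 4110, "Half": 2055, "Quarter": 1026, "Eighth": 513}
sample_counts = {name: n * TIME_STEPS for name, n in tensor_counts.items()}

# Toy input: 1,000 observations yield three complete windows of 300.
windows = to_windows(list(range(1000)))
print(len(windows), len(windows[0]))
```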
3.2.2 Input feature selection
Feature | Field | Data type | Description |
---|---|---|---|
Altitude | Alt | integer | The altitude in feet at standard pressure |
Ground Altitude | Galt | integer | The altitude adjusted for local air pressure |
Airspeed | Spd | knots (float) | The ground speed in knots |
Barometric Pressure | InHg | float | The air pressure in inches of mercury that was used to calculate the AMSL altitude from the standard pressure altitude |
Vertical Speed | Vsi | integer | Vertical speed in feet per minute |
Time | PosTime | epoch (ms) | The time that the position was last reported by the aircraft |
Track | Trak | degrees (float) | Aircraft track angle across the ground, clockwise from 0° north |
Latitude | Lat | float | The aircraft’s latitude over the ground |
Longitude | Long | float | The aircraft’s longitude over the ground |
Location | X,Y,Z | float | Cartesian coordinates of the aircraft. This data feature is derived from the lat/long coordinates and it is not originally part of the ADS-B broadcast |
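The Location feature is described as derived from the latitude and longitude coordinates, but the conversion itself is not given here. One common choice, shown only as a sketch, maps latitude, longitude, and altitude onto Cartesian coordinates with a spherical Earth model. The function name, the spherical approximation, and the radius constant are assumptions for illustration, not the authors' stated method.

```python
import math

# Assumption: mean Earth radius (about 6,371 km) expressed in feet.
EARTH_RADIUS_FT = 20_902_231

def to_cartesian(lat_deg, lon_deg, alt_ft=0.0):
    """Convert latitude/longitude (degrees) and altitude (feet) to X, Y, Z in feet.

    Uses a spherical Earth approximation, not the WGS84 ellipsoid.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    r = EARTH_RADIUS_FT + alt_ft
    x = r * math.cos(lat) * math.cos(lon)
    y = r * math.cos(lat) * math.sin(lon)
    z = r * math.sin(lat)
    return x, y, z

x, y, z = to_cartesian(0.0, 0.0)
print(x, y, z)
```

A Cartesian representation avoids the discontinuity in longitude at ±180°, which is one reason such a derived feature can be easier for a network to use than raw coordinates.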
Limited feature set | Medium feature set | Full feature set |
---|---|---|
Altitude, Airspeed, Vertical speed, Location (X,Y,Z) | Altitude, Airspeed, Vertical speed, Location (X,Y,Z), Barometric Pressure, Time, Track | Altitude, Airspeed, Vertical speed, Location (X,Y,Z), Barometric Pressure, Time, Track, Ground Altitude, Lat/Long |
3.2.3 Hyperparameter selection
3.2.4 LSTM training, testing, and evaluation
4 Results and discussion
4.1 Experiment 1–24 models: data size comparison
Num | Learning rate | Feature set | Data amount | Mean accuracy | Mean loss |
---|---|---|---|---|---|
1 | 0.001 | Limited | Full | 0.894 (+/−0.018) | 0.413 (+/−0.069) |
2 | 0.01 | Limited | Full | 0.890 (+/−0.019) | 0.346 (+/−0.083) |
3 | 0.001 | Medium | Full | 0.513 (+/−0.031) | 1.321 (+/−0.092) |
4 | 0.01 | Medium | Full | 0.337 (+/−0.019) | 3.604 (+/−0.117) |
5 | 0.001 | Full | Full | 0.759 (+/−0.038) | 0.675 (+/−0.100) |
6 | 0.01 | Full | Full | 0.356 (+/−0.018) | 2.960 (+/−0.210) |
7 | 0.001 | Limited | Half | 0.848 (+/−0.019) | 0.474 (+/−0.072) |
8 | 0.01 | Limited | Half | 0.865 (+/−0.023) | 0.367 (+/−0.051) |
9 | 0.001 | Medium | Half | 0.492 (+/−0.029) | 1.493 (+/−0.102) |
10 | 0.01 | Medium | Half | 0.490 (+/−0.031) | 1.622 (+/−0.102) |
11 | 0.001 | Full | Half | 0.740 (+/−0.025) | 0.595 (+/−0.053) |
12 | 0.01 | Full | Half | 0.354 (+/−0.023) | 3.071 (+/−0.149) |
13 | 0.001 | Limited | 1/4 | 0.839 (+/−0.016) | 0.473 (+/−0.098) |
14 | 0.01 | Limited | 1/4 | 0.839 (+/−0.014) | 0.478 (+/−0.054) |
15 | 0.001 | Medium | 1/4 | 0.700 (+/−0.026) | 0.789 (+/−0.078) |
16 | 0.01 | Medium | 1/4 | 0.391 (+/−0.040) | 2.211 (+/−0.152) |
17 | 0.001 | Full | 1/4 | 0.658 (+/−0.028) | 0.891 (+/−0.089) |
18 | 0.01 | Full | 1/4 | 0.505 (+/−0.040) | 1.645 (+/−0.121) |
19 | 0.001 | Limited | 1/8 | 0.758 (+/−0.019) | 0.897 (+/−0.124) |
20 | 0.01 | Limited | 1/8 | 0.688 (+/−0.037) | 1.525 (+/−0.224) |
21 | 0.001 | Medium | 1/8 | 0.678 (+/−0.028) | 1.008 (+/−0.088) |
22 | 0.01 | Medium | 1/8 | 0.607 (+/−0.026) | 1.397 (+/−0.175) |
23 | 0.001 | Full | 1/8 | 0.595 (+/−0.038) | 1.833 (+/−0.179) |
24 | 0.01 | Full | 1/8 | 0.484 (+/−0.021) | 2.120 (+/−0.116) |
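One way to read this table is to fix the learning rate and feature set and follow mean accuracy as the training data shrinks. The sketch below does this for the limited feature set at a learning rate of 0.001, using values copied from the table; the data structure and variable names are illustrative.

```python
# (learning rate, feature set, data amount) -> mean accuracy, copied from the table
results = {
    (0.001, "Limited", "Full"): 0.894,
    (0.001, "Limited", "Half"): 0.848,
    (0.001, "Limited", "1/4"): 0.839,
    (0.001, "Limited", "1/8"): 0.758,
}

order = ["Full", "Half", "1/4", "1/8"]
accuracies = [results[(0.001, "Limited", amount)] for amount in order]

# Change in mean accuracy each time the training data is halved.
drops = [round(after - before, 3) for before, after in zip(accuracies, accuracies[1:])]

for amount, accuracy in zip(order, accuracies):
    print(f"{amount:>4}: {accuracy:.3f}")
print(drops)
```

For this configuration, the largest single drop occurs between the quarter and eighth datasets.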
4.2 Experiment 2–255 models: feature set comparison
Ex | Feature set | Mean accuracy | Mean loss |
---|---|---|---|
56 | Spd,InHg,Vsi | 0.894 (+/−0.013) | 0.304 (+/−0.038) |
40 | Spd,Vsi | 0.887 (+/−0.025) | 0.336 (+/−0.088) |
251 | Alt,Galt,Spd,InHg,Vsi,Trak,location | 0.886 (+/−0.023) | 0.319 (+/−0.056) |
59 | Spd,InHg,Vsi,Trak,location | 0.886 (+/−0.015) | 0.307 (+/−0.060) |
186 | Alt,Spd,InHg,Vsi,Trak | 0.884 (+/−0.017) | 0.330 (+/−0.059) |
171 | Alt,Spd,Vsi,Trak,location | 0.884 (+/−0.016) | 0.366 (+/−0.132) |
43 | Spd,Vsi,Trak,location | 0.884 (+/−0.015) | 0.326 (+/−0.066) |
42 | Spd,Vsi,Trak | 0.877 (+/−0.017) | 0.346 (+/−0.067) |
185 | Alt,Spd,InHg,Vsi,location | 0.876 (+/−0.026) | 0.318 (+/−0.056) |
105 | Galt,Spd,Vsi,location | 0.876 (+/−0.019) | 0.347 (+/−0.052) |
184 | Alt,Spd,InHg,Vsi | 0.875 (+/−0.020) | 0.332 (+/−0.070) |
235 | Alt,Galt,Spd,Vsi,Trak,location | 0.875 (+/−0.018) | 0.362 (+/−0.060) |
58 | Spd,InHg,Vsi,Trak | 0.875 (+/−0.015) | 0.360 (+/−0.057) |
248 | Alt,Galt,Spd,InHg,Vsi | 0.874 (+/−0.017) | 0.342 (+/−0.056) |
123 | Galt,Spd,InHg,Vsi,Trak,location | 0.874 (+/−0.015) | 0.370 (+/−0.064) |
169 | Alt,Spd,Vsi,location | 0.872 (+/−0.020) | 0.364 (+/−0.077) |
41 | Spd,Vsi,location | 0.871 (+/−0.023) | 0.334 (+/−0.037) |
104 | Galt,Spd,Vsi | 0.871 (+/−0.020) | 0.369 (+/−0.060) |
114 | Galt,Spd,InHg,Trak | 0.871 (+/−0.017) | 0.368 (+/−0.062) |
106 | Galt,Spd,Vsi,Trak | 0.871 (+/−0.013) | 0.396 (+/−0.069) |
Ex | Feature set | Mean accuracy | Mean loss |
---|---|---|---|
7 | PosTime,Trak,location | 0.337 (+/−0.038) | 1.622 (+/−0.088) |
29 | InHg,Vsi,PosTime,location | 0.337 (+/−0.037) | 4.511 (+/−3.705) |
21 | InHg,PosTime,location | 0.337 (+/−0.028) | 38.038 (+/−109.699) |
22 | InHg,PosTime,Trak | 0.336 (+/−0.024) | 1.615 (+/−0.117) |
52 | Spd,InHg,PosTime | 0.334 (+/−0.024) | 2.812 (+/−0.138) |
127 | Galt,Spd,InHg,Vsi,PosTime,Trak,location | 0.334 (+/−0.019) | 18.762 (+/−1.317) |
37 | Spd,PosTime,location | 0.332 (+/−0.016) | 5.308 (+/−0.238) |
55 | Spd,InHg,PosTime,Trak, location | 0.114 (+/−0.022) | 3.323 (+/−0.116) |
38 | Spd,PosTime,Trak | 0.110 (+/−0.018) | 2.742 (+/−0.072) |
54 | Spd,InHg,PosTime,Trak | 0.074 (+/−0.015) | 4.664 (+/−0.112) |
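The experiment numbers in both Experiment 2 tables appear to encode the feature subset as an 8-bit mask, with Alt as the most significant bit and location as the least significant. This is inferred from the rows listed here rather than stated by the source, so the sketch below should be read as a consistency check; the function name `decode_experiment` is hypothetical.

```python
# Bit order inferred from the tables: Alt is the highest bit, location the lowest.
FEATURES = ["Alt", "Galt", "Spd", "InHg", "Vsi", "PosTime", "Trak", "location"]

def decode_experiment(number):
    """Return the feature names whose bits are set in an experiment number (1-255)."""
    if not 1 <= number <= 255:
        raise ValueError("experiment number must be between 1 and 255")
    n_bits = len(FEATURES)
    return [
        name
        for position, name in enumerate(FEATURES)
        if number & (1 << (n_bits - 1 - position))
    ]

print(decode_experiment(56))  # highest-accuracy row
print(decode_experiment(54))  # lowest-accuracy row

# Experiment numbers from the lowest-accuracy table.
lowest = [7, 29, 21, 22, 52, 127, 37, 55, 38, 54]
print(all("PosTime" in decode_experiment(n) for n in lowest))
```

Decoded this way, every row in the lowest-accuracy table includes PosTime, while none of the rows in the highest-accuracy table does.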