Skip to main content
Erschienen in: EURO Journal on Transportation and Logistics 5/2019

13.08.2018 | Research Paper

Unsupervised prototype reduction for data exploration and an application to air traffic management initiatives

verfasst von: Alexander Estes, David J. Lovell, Michael O. Ball

Erschienen in: EURO Journal on Transportation and Logistics | Ausgabe 5/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We discuss a new approach to unsupervised learning and data exploration that involves summarizing a large data set using a small set of “representative” elements. These representatives may be presented to a user in order to provide intuition regarding the distribution of observations. Alternatively, these representatives can be used as cases for more detailed analysis. We call the problem of selecting the representatives the unsupervised prototype reduction problem. We discuss the KC-UPR method for this problem and compare it to other existing methods that may be applied to this problem. We propose a new type of distance measure that allows for more interpretable presentation of results from the KC-UPR method. We demonstrate how solutions from the unsupervised prototype reduction problem may be used to provide decision support for the planning of air traffic management initiatives, and we produce computational results that compare the effectiveness of several methods in this application. We also provide an example of how the KC-UPR method can be used for data exploration, using data from air traffic management initiatives at Newark Liberty International Airport.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
Zurück zum Zitat Ball M, Swaroop P, Barnhart C, Yan C, Hansen M, Kang L, Liu Y, Vaze V (2015) Service level expectation setting for air traffic flow management: Practical challenges and benefits assessment. In: Proceedings of the 12th USA/Europe air traffic management R&D seminar Ball M, Swaroop P, Barnhart C, Yan C, Hansen M, Kang L, Liu Y, Vaze V (2015) Service level expectation setting for air traffic flow management: Practical challenges and benefits assessment. In: Proceedings of the 12th USA/Europe air traffic management R&D seminar
Zurück zum Zitat Ball MO, Lulli G (2004) Ground delay programs: optimizing over the included flight set based on distance. Air Traffic Control Q 12(1):1–25 Ball MO, Lulli G (2004) Ground delay programs: optimizing over the included flight set based on distance. Air Traffic Control Q 12(1):1–25
Zurück zum Zitat Ball MO, Hoffman R, Odoni AR, Rifkin R (2003) A stochastic integer program with dual network structure and its application to the ground-holding problem. Oper Res 51(1):167–171 Ball MO, Hoffman R, Odoni AR, Rifkin R (2003) A stochastic integer program with dual network structure and its application to the ground-holding problem. Oper Res 51(1):167–171
Zurück zum Zitat Ball MO, Hoffman R, Mukherjee A (2010) Ground delay program planning under uncertainty based on the ration-by-distance principle. Transp Sci 44(1):1–14 Ball MO, Hoffman R, Mukherjee A (2010) Ground delay program planning under uncertainty based on the ration-by-distance principle. Transp Sci 44(1):1–14
Zurück zum Zitat Bertsimas D, Lulli G, Odoni A (2011) An integer optimization approach to large-scale air traffic flow management. Oper Res 59(1):211–227 Bertsimas D, Lulli G, Odoni A (2011) An integer optimization approach to large-scale air traffic flow management. Oper Res 59(1):211–227
Zurück zum Zitat Buehler R, Griffin D, Ross M (1994) Exploring the “planning fallacy”: why people underestimate their task completion times. J Pers Soc Psychol 67(3):366 Buehler R, Griffin D, Ross M (1994) Exploring the “planning fallacy”: why people underestimate their task completion times. J Pers Soc Psychol 67(3):366
Zurück zum Zitat Burges CJC (2010) Dimension reduction: a guided tour. Found Trends Mach Learn 2(4):275–365 Burges CJC (2010) Dimension reduction: a guided tour. Found Trends Mach Learn 2(4):275–365
Zurück zum Zitat Burman P, Polonik W (2009) Multivariate mode hunting: data analytic tools with measures of significance. J Multivar Anal 100(6):1198–1218 Burman P, Polonik W (2009) Multivariate mode hunting: data analytic tools with measures of significance. J Multivar Anal 100(6):1198–1218
Zurück zum Zitat Caruso C, Colorni A, Aloi L (2003) Dominant, an algorithm for the p-center problem. Eur J Oper Res 149(1):53–64 Caruso C, Colorni A, Aloi L (2003) Dominant, an algorithm for the p-center problem. Eur J Oper Res 149(1):53–64
Zurück zum Zitat Chen D, Chen R (2009) New relaxation-based algorithms for the optimal solution of the continuous and discrete p-center problems. Comput Oper Res 36(5):1646–1655 Chen D, Chen R (2009) New relaxation-based algorithms for the optimal solution of the continuous and discrete p-center problems. Comput Oper Res 36(5):1646–1655
Zurück zum Zitat Chen WR, Yun YH, Wen M, Lu HM, Zhang ZM, Liang YZ (2016) Representative subset selection and outlier detection via isolation forest. Anal Methods 8(39):7225–7231 Chen WR, Yun YH, Wen M, Lu HM, Zhang ZM, Liang YZ (2016) Representative subset selection and outlier detection via isolation forest. Anal Methods 8(39):7225–7231
Zurück zum Zitat Cheng F, Gulding J, Baszczewski B, Galaviz R (2011) An optimization model for sample day selection in NAS-wide modeling studies. In: Integrated communications, navigation and surveilance conference (ICNS), 2011. IEEE, p L6-1 Cheng F, Gulding J, Baszczewski B, Galaviz R (2011) An optimization model for sample day selection in NAS-wide modeling studies. In: Integrated communications, navigation and surveilance conference (ICNS), 2011. IEEE, p L6-1
Zurück zum Zitat Clark BN, Colbourn CJ, Johnson DS (1990) Unit disk graphs. Discrete Mathe 86(1–3):165–177 Clark BN, Colbourn CJ, Johnson DS (1990) Unit disk graphs. Discrete Mathe 86(1–3):165–177
Zurück zum Zitat Clark RD (1997) OptiSim: an extended dissimilarity selection method for finding diverse representative subsets. J Chem Inf Comput Sci 37(6):1181–1188 Clark RD (1997) OptiSim: an extended dissimilarity selection method for finding diverse representative subsets. J Chem Inf Comput Sci 37(6):1181–1188
Zurück zum Zitat Cook LS, Wood B (2010) A model for determining ground delay program parameters using a probabilistic forecast of stratus clearing. Air Traffic Control Q 18(1):85 Cook LS, Wood B (2010) A model for determining ground delay program parameters using a probabilistic forecast of stratus clearing. Air Traffic Control Q 18(1):85
Zurück zum Zitat Daszykowski M, Walczak B, Massart D (2002) Representative subset selection. Anal Chim Acta 468(1):91–103 Daszykowski M, Walczak B, Massart D (2002) Representative subset selection. Anal Chim Acta 468(1):91–103
Zurück zum Zitat Davidović T, Ramljak D, Šelmić M, Teodorović D (2011) Bee colony optimization for the p-center problem. Comput Oper Res 38(10):1367–1376 Davidović T, Ramljak D, Šelmić M, Teodorović D (2011) Bee colony optimization for the p-center problem. Comput Oper Res 38(10):1367–1376
Zurück zum Zitat De Sisternes Jimenez F, Webster MD (2013) Optimal selection of sample weeks for approximating the net load in generation planning problems. Technical report ESD-WP-2013-03, Massachusetts Institute of Technology. Engineering Systems Division De Sisternes Jimenez F, Webster MD (2013) Optimal selection of sample weeks for approximating the net load in generation planning problems. Technical report ESD-WP-2013-03, Massachusetts Institute of Technology. Engineering Systems Division
Zurück zum Zitat Delgado L, Prats X, Sridhar B (2013) Cruise speed reduction for ground delay programs: a case study for San Francisco International Airport arrivals. Transp Res Part C Emerg Technol 36:83–96 Delgado L, Prats X, Sridhar B (2013) Cruise speed reduction for ground delay programs: a case study for San Francisco International Airport arrivals. Transp Res Part C Emerg Technol 36:83–96
Zurück zum Zitat Elloumi S, Labbé M, Pochet Y (2004) A new formulation and resolution method for the p-center problem. INFORMS J Comput 16(1):84–94 Elloumi S, Labbé M, Pochet Y (2004) A new formulation and resolution method for the p-center problem. INFORMS J Comput 16(1):84–94
Zurück zum Zitat Ester M, Kriegel HP, Sander J, Xu X, et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedinsg of the 2nd international conference on knowledge discovery and data mining, pp 226–231 Ester M, Kriegel HP, Sander J, Xu X, et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedinsg of the 2nd international conference on knowledge discovery and data mining, pp 226–231
Zurück zum Zitat Estes A, Ball MO, Lovell D (2017a) Predicting performance of ground delay programs. In: 12th USA/Europe air traffic management R&D seminar, Seattle, WA Estes A, Ball MO, Lovell D (2017a) Predicting performance of ground delay programs. In: 12th USA/Europe air traffic management R&D seminar, Seattle, WA
Zurück zum Zitat Estes A, Lovell D, Ball M (2017b) Data exploration with selection of representative regions: formulation, axioms, methods, and consistency. Technical report, available at SSRN: https://ssrn.com/abstract=3005997. Accessed 9 Aug 2018 Estes A, Lovell D, Ball M (2017b) Data exploration with selection of representative regions: formulation, axioms, methods, and consistency. Technical report, available at SSRN: https://​ssrn.​com/​abstract=​3005997. Accessed 9 Aug 2018
Zurück zum Zitat Flathers B, Fronzak M, Huberdeau M, McKnight C, Wang M, Wilhelm G (2013) A framework for the development of the ATM-weather integration concept. Technical report, The MITRE Corporation Flathers B, Fronzak M, Huberdeau M, McKnight C, Wang M, Wilhelm G (2013) A framework for the development of the ATM-weather integration concept. Technical report, The MITRE Corporation
Zurück zum Zitat Francois D, Wertz V, Verleysen M (2007) The concentration of fractional distances. IEEE Trans Knowl Data Eng 19(7):873–886 Francois D, Wertz V, Verleysen M (2007) The concentration of fractional distances. IEEE Trans Knowl Data Eng 19(7):873–886
Zurück zum Zitat Friedman JH, Fisher NI (1999) Bump hunting in high-dimensional data. Stat Comput 9(2):123–143 Friedman JH, Fisher NI (1999) Bump hunting in high-dimensional data. Stat Comput 9(2):123–143
Zurück zum Zitat Garcia S, Derrac J, Cano J, Herrera F (2012) Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans Pattern Anal Mach Intell 34(3):417–435 Garcia S, Derrac J, Cano J, Herrera F (2012) Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans Pattern Anal Mach Intell 34(3):417–435
Zurück zum Zitat Gonzalez TF (1985) Clustering to minimize the maximum intercluster distance. Theor Comput Sci 38:293–306 Gonzalez TF (1985) Clustering to minimize the maximum intercluster distance. Theor Comput Sci 38:293–306
Zurück zum Zitat Gorripaty S, Hansen M, Pozdnukhov A (2016) Decision support framework to assist air traffic management. In: IEEE/AIAA 35th digital avionics systems conference (DASC), 2016. IEEE, pp 1–6 Gorripaty S, Hansen M, Pozdnukhov A (2016) Decision support framework to assist air traffic management. In: IEEE/AIAA 35th digital avionics systems conference (DASC), 2016. IEEE, pp 1–6
Zurück zum Zitat Grabbe S, Sridhar B, Mukherjee A (2013) Similar days in the NAS: an airport perspective. In: AIAA Aviation technology, integration, and operations conference Grabbe S, Sridhar B, Mukherjee A (2013) Similar days in the NAS: an airport perspective. In: AIAA Aviation technology, integration, and operations conference
Zurück zum Zitat Hassin R, Levin A, Morad D (2003) Lexicographic local search and the p-center problem. Eur J Oper Res 151(2):265–279 Hassin R, Levin A, Morad D (2003) Lexicographic local search and the p-center problem. Eur J Oper Res 151(2):265–279
Zurück zum Zitat Haynes TW, Hedetniemi S, Slater P (1998) Fundamentals of domination in graphs. CRC Press, Boca Raton Haynes TW, Hedetniemi S, Slater P (1998) Fundamentals of domination in graphs. CRC Press, Boca Raton
Zurück zum Zitat Hedetniemi ST, Laskar RC (1990) Bibliography on domination in graphs and some basic definitions of domination parameters. Discrete Math 86(1):257–277 Hedetniemi ST, Laskar RC (1990) Bibliography on domination in graphs and some basic definitions of domination parameters. Discrete Math 86(1):257–277
Zurück zum Zitat Hochbaum DS (1984) When are NP-hard location problems easy? Ann Oper Res 1(3):201–214 Hochbaum DS (1984) When are NP-hard location problems easy? Ann Oper Res 1(3):201–214
Zurück zum Zitat Hochbaum DS, Shmoys DB (1985) A best possible heuristic for the k-center problem. Math Oper Res 10(2):180–184 Hochbaum DS, Shmoys DB (1985) A best possible heuristic for the k-center problem. Math Oper Res 10(2):180–184
Zurück zum Zitat Huo X, Ni XS, Smith AK (2008) A survey of manifold-based learning methods. In: Recent advances in data mining of enterprise data. World Scientific, pp 691–745 Huo X, Ni XS, Smith AK (2008) A survey of manifold-based learning methods. In: Recent advances in data mining of enterprise data. World Scientific, pp 691–745
Zurück zum Zitat Ilhan T, Ozsoy F, Pinar M (2002) An efficient exact algorithm for the vertex p-center problem and computational experiments for different set covering subproblems. Technical report, Bilkent University, Department of Industrial Engineering Ilhan T, Ozsoy F, Pinar M (2002) An efficient exact algorithm for the vertex p-center problem and computational experiments for different set covering subproblems. Technical report, Bilkent University, Department of Industrial Engineering
Zurück zum Zitat Inselberg A (1985) The plane with parallel coordinates. Vis Comput 1(2):69–91 Inselberg A (1985) The plane with parallel coordinates. Vis Comput 1(2):69–91
Zurück zum Zitat Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv (CSUR) 31(3):264–323 Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv (CSUR) 31(3):264–323
Zurück zum Zitat Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. Freeman, San Francisco Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. Freeman, San Francisco
Zurück zum Zitat Kahneman D, Tversky A (1996) On the reality of cognitive illusions. Psychol Rev 103(3):582–591 Kahneman D, Tversky A (1996) On the reality of cognitive illusions. Psychol Rev 103(3):582–591
Zurück zum Zitat Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148 Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148
Zurück zum Zitat Kotnyek B, Richetta O (2006) Equitable models for the stochastic ground-holding problem under collaborative decision making. Transp Sci 40(2):133–146 Kotnyek B, Richetta O (2006) Equitable models for the stochastic ground-holding problem under collaborative decision making. Transp Sci 40(2):133–146
Zurück zum Zitat Kuhn KD (2016) A methodology for identifying similar days in air traffic flow management initiative planning. Transp Res Part C Emerg Technol 69:1–15 Kuhn KD (2016) A methodology for identifying similar days in air traffic flow management initiative planning. Transp Res Part C Emerg Technol 69:1–15
Zurück zum Zitat Lee WY, Goodwin P, Fildes R, Nikolopoulos K, Lawrence M (2007) Providing support for the use of analogies in demand forecasting tasks. Int J Forecast 23(3):377–390 Lee WY, Goodwin P, Fildes R, Nikolopoulos K, Lawrence M (2007) Providing support for the use of analogies in demand forecasting tasks. Int J Forecast 23(3):377–390
Zurück zum Zitat Liu FT, Ting KM, Zhou ZH (2008a) Isolation forest. In: Eighth IEEE international conference on data mining, 2008. IEEE, pp 413–422 Liu FT, Ting KM, Zhou ZH (2008a) Isolation forest. In: Eighth IEEE international conference on data mining, 2008. IEEE, pp 413–422
Zurück zum Zitat PcB Liu, Hansen M, Mukherjee A (2008b) Scenario-based air traffic flow management: from theory to practice. Transp Res Part B Methodol 42(7):685–702 PcB Liu, Hansen M, Mukherjee A (2008b) Scenario-based air traffic flow management: from theory to practice. Transp Res Part B Methodol 42(7):685–702
Zurück zum Zitat Liu Y, Hansen M (2013) Evaluation of the performance of ground delay programs. Transp Res Rec J Transp Res Board 2400:54–64 Liu Y, Hansen M (2013) Evaluation of the performance of ground delay programs. Transp Res Rec J Transp Res Board 2400:54–64
Zurück zum Zitat Lovallo D, Clarke C, Camerer C (2012) Robust analogizing and the outside view: two empirical tests of case-based decision making. Strateg Manag J 33(5):496–512 Lovallo D, Clarke C, Camerer C (2012) Robust analogizing and the outside view: two empirical tests of case-based decision making. Strateg Manag J 33(5):496–512
Zurück zum Zitat Mihelič J, Robič B (2003) Approximation algorithms for the k-center problem: an experimental evaluation. In: Operations research Proceedings 2002. Springer, pp 371–376 Mihelič J, Robič B (2003) Approximation algorithms for the k-center problem: an experimental evaluation. In: Operations research Proceedings 2002. Springer, pp 371–376
Zurück zum Zitat Minieka E (1970) The m-center problem. SIAM Rev 12(1):138–139 Minieka E (1970) The m-center problem. SIAM Rev 12(1):138–139
Zurück zum Zitat Mladenović N, Labbé M, Hansen P (2003) Solving the p-center problem with tabu search and variable neighborhood search. Networks 42(1):48–64 Mladenović N, Labbé M, Hansen P (2003) Solving the p-center problem with tabu search and variable neighborhood search. Networks 42(1):48–64
Zurück zum Zitat Mukherjee A, Hansen M (2007) A dynamic stochastic model for the single airport ground holding problem. Transp Sci 41(4):444–456 Mukherjee A, Hansen M (2007) A dynamic stochastic model for the single airport ground holding problem. Transp Sci 41(4):444–456
Zurück zum Zitat Olvera-López JA, Carrasco-Ochoa JA, Martínez-Trinidad JF, Kittler J (2010) A review of instance selection methods. Artif Intell Rev 34(2):133–143 Olvera-López JA, Carrasco-Ochoa JA, Martínez-Trinidad JF, Kittler J (2010) A review of instance selection methods. Artif Intell Rev 34(2):133–143
Zurück zum Zitat Park HS, Jun CH (2009) A simple and fast algorithm for k-medoids clustering. Expert Syst Appl 36(2):3336–3341 Park HS, Jun CH (2009) A simple and fast algorithm for k-medoids clustering. Expert Syst Appl 36(2):3336–3341
Zurück zum Zitat Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 33(3):1065–1076 Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 33(3):1065–1076
Zurück zum Zitat Penny S, Lewis T, Hoffman B, White T, Krozel J (2009) Cluster analysis for the annualization of ACES simulated NAS metrics. Technical report, Metron Aviation Penny S, Lewis T, Hoffman B, White T, Krozel J (2009) Cluster analysis for the annualization of ACES simulated NAS metrics. Technical report, Metron Aviation
Zurück zum Zitat Robič B, Mihelič J (2005) Solving the k-center problem efficiently with a dominating set algorithm. CIT J Comput Inf Technol 13(3):225–234 Robič B, Mihelič J (2005) Solving the k-center problem efficiently with a dominating set algorithm. CIT J Comput Inf Technol 13(3):225–234
Zurück zum Zitat Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27(3):832–837 Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27(3):832–837
Zurück zum Zitat Scott DW (2015) Multivariate density estimation: theory, practice, and visualization. Wiley, New York Scott DW (2015) Multivariate density estimation: theory, practice, and visualization. Wiley, New York
Zurück zum Zitat Swaroop P, Ball M (2013) Consensus-building mechanism for setting service expectations in air traffic flow management. Transp Res Rec J Transp Res Board 2325:87–96 Swaroop P, Ball M (2013) Consensus-building mechanism for setting service expectations in air traffic flow management. Transp Res Rec J Transp Res Board 2325:87–96
Zurück zum Zitat Tan P, Steinbach M, Kumar V (2014) Introduction to data mining. Pearson Education Limited, Harlow Tan P, Steinbach M, Kumar V (2014) Introduction to data mining. Pearson Education Limited, Harlow
Zurück zum Zitat Triguero I, Derrac J, Garcia S, Herrera F (2012) A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE Trans Syst Man Cybern Part C (Appl Rev) 42(1):86–100 Triguero I, Derrac J, Garcia S, Herrera F (2012) A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE Trans Syst Man Cybern Part C (Appl Rev) 42(1):86–100
Zurück zum Zitat Tversky A, Kahneman D (1974) Judgment under uncertainty: heuristics and biases. Science 185(4157):1124–1131 Tversky A, Kahneman D (1974) Judgment under uncertainty: heuristics and biases. Science 185(4157):1124–1131
Zurück zum Zitat Van Dongen S (2008) Graph clustering via a discrete uncoupling process. SIAM J Matrix Anal Appl 30(1):121–141 Van Dongen S (2008) Graph clustering via a discrete uncoupling process. SIAM J Matrix Anal Appl 30(1):121–141
Metadaten
Titel
Unsupervised prototype reduction for data exploration and an application to air traffic management initiatives
verfasst von
Alexander Estes
David J. Lovell
Michael O. Ball
Publikationsdatum
13.08.2018
Verlag
Springer Berlin Heidelberg
Erschienen in
EURO Journal on Transportation and Logistics / Ausgabe 5/2019
Print ISSN: 2192-4376
Elektronische ISSN: 2192-4384
DOI
https://doi.org/10.1007/s13676-018-0132-0

Weitere Artikel der Ausgabe 5/2019

EURO Journal on Transportation and Logistics 5/2019 Zur Ausgabe