
2021 | OriginalPaper | Book chapter

Scalable Statistical Inference of Photometric Redshift via Data Subsampling

Authors: Arindam Fadikar, Stefan M. Wild, Jonas Chaves-Montero

Published in: Computational Science – ICCS 2021

Publisher: Springer International Publishing


Abstract

Handling big data has largely been a major bottleneck in traditional statistical models. Consequently, when accurate point prediction is the primary target, machine learning models are often preferred over their statistical counterparts for bigger problems. But full probabilistic statistical models often outperform other models in quantifying uncertainties associated with model predictions. We develop a data-driven statistical modeling framework that combines the uncertainties from an ensemble of statistical models learned on smaller subsets of data carefully chosen to account for imbalances in the input space. We demonstrate this method on a photometric redshift estimation problem in cosmology, which seeks to infer a distribution of the redshift—the stretching effect in observing the light of far-away galaxies—given multivariate color information observed for an object in the sky. Our proposed method performs balanced partitioning, graph-based data subsampling across the partitions, and training of an ensemble of Gaussian process models.
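The three steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: equal-width bins on a single input stand in for the balanced partitioning, uniform random draws within each bin stand in for the graph-based subsampling, and the Gaussian process fits are omitted, so the per-model means and variances passed to `combine_gaussians` are made-up numbers. All function names are hypothetical. The combination step treats the ensemble as an equal-weight mixture of Gaussian predictions, which is one common choice and is assumed here.

```python
import random
from statistics import fmean


def partition_by_bins(xs, n_bins):
    """Group indices of xs into equal-width bins; empty bins are dropped."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero range
    bins = [[] for _ in range(n_bins)]
    for i, x in enumerate(xs):
        bins[min(int((x - lo) / width), n_bins - 1)].append(i)
    return [b for b in bins if b]


def subsample(parts, per_part, rng):
    """Draw up to per_part indices from every partition, without replacement."""
    chosen = []
    for part in parts:
        chosen.extend(rng.sample(part, min(per_part, len(part))))
    return chosen


def combine_gaussians(means, variances):
    """Mean and variance of an equal-weight mixture of Gaussian predictions."""
    mix_mean = fmean(means)
    # Law of total variance: average within-model variance plus spread of the means.
    mix_var = fmean(v + m * m for m, v in zip(means, variances)) - mix_mean ** 2
    return mix_mean, mix_var


rng = random.Random(0)
# Imbalanced toy input: 900 points near 0 and only 100 points near 5.
xs = [rng.gauss(0.0, 0.3) for _ in range(900)] + [rng.gauss(5.0, 0.3) for _ in range(100)]

parts = partition_by_bins(xs, n_bins=10)
subset = subsample(parts, per_part=5, rng=rng)
sparse_share = sum(xs[i] > 2.5 for i in subset) / len(subset)

# Hypothetical predictions from three models trained on three different subsets.
mix_mean, mix_var = combine_gaussians([0.50, 0.54, 0.46], [0.01, 0.02, 0.01])

print(f"share of sparse-region points in the subsample: {sparse_share:.2f}")
print(f"ensemble prediction: mean={mix_mean:.3f}, variance={mix_var:.4f}")
```

Because every occupied bin contributes at most the same number of points, the sparse region near 5 makes up a larger share of the subsample than its 10% share of the full data. The mixture variance is never smaller than the average of the individual variances, since disagreement between the ensemble members adds to the reported uncertainty.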


Metadata
Title
Scalable Statistical Inference of Photometric Redshift via Data Subsampling
Authors
Arindam Fadikar
Stefan M. Wild
Jonas Chaves-Montero
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-77977-1_19
