2019 | OriginalPaper | Chapter

Experimental Design Issues in Big Data: The Question of Bias

Authors: Elena Pesce, Eva Riccomagno, Henry P. Wynn

Published in: Statistical Learning of Complex Data

Publisher: Springer International Publishing

Abstract

Data in scientific studies can be collected via a controlled experiment or by passive observation. Big data is often collected passively, e.g. from social media. In studies of causation, great efforts are made to guard against bias, hidden confounders and feedback, any of which can destroy the identification of causation by corrupting or omitting counterfactuals (controls). Various solutions to these problems are discussed, including randomisation.
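The contrast between passive observation and randomisation can be illustrated with a small simulation. This is a minimal sketch and not taken from the chapter: the data-generating model, the effect size `TRUE_EFFECT`, the confounder weight of 2.0 and the function name `difference_in_means` are all assumptions chosen for illustration. A hidden variable `u` influences both who receives the treatment and the outcome, so the naive difference in means is biased under passive collection, while a coin-flip assignment breaks that link.

```python
import random
import statistics

TRUE_EFFECT = 1.0  # assumed causal effect of treatment (illustrative value)


def difference_in_means(n, randomised, seed=0):
    """Simulate n units and return mean(treated) - mean(control).

    Hypothetical model: a hidden confounder u raises the outcome and,
    under passive observation, also raises the chance of being treated.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # hidden confounder, never observed
        if randomised:
            t = rng.random() < 0.5  # assignment independent of u
        else:
            t = u + rng.gauss(0.0, 1.0) > 0.0  # self-selection driven by u
        y = TRUE_EFFECT * t + 2.0 * u + rng.gauss(0.0, 1.0)
        (treated if t else control).append(y)
    return statistics.fmean(treated) - statistics.fmean(control)


observational_estimate = difference_in_means(20000, randomised=False)
randomised_estimate = difference_in_means(20000, randomised=True)

print("true effect:           ", TRUE_EFFECT)
print("observational estimate:", round(observational_estimate, 2))
print("randomised estimate:   ", round(randomised_estimate, 2))
```

Under these assumptions the observational estimate overstates the effect because treated units have systematically higher `u`, whereas the randomised estimate stays close to `TRUE_EFFECT`. No amount of additional passively collected data removes that gap; only a change in how treatment is assigned, or an adjustment for the confounder, does.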

Metadata
Title
Experimental Design Issues in Big Data: The Question of Bias
Authors
Elena Pesce
Eva Riccomagno
Henry P. Wynn
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-21140-0_20
