One goal of statistical studies is to highlight associations between pairs of variables. This is particularly useful for getting a clear picture of a multidimensional data set and for motivating a specific policy intervention (Sect. 4.1). The choice of method, however, is not straightforward and depends on the type of variables involved. Testing for correlation is the relevant approach for investigating a linear association between two numerical variables (Sect. 4.2). The chi-square test is an inferential test that uses sample data to draw conclusions about the relationship between two categorical variables (Sect. 4.3). When one variable is numerical and the other is categorical, the usual approach is to test for differences between means or to carry out an analysis of variance (Sect. 4.4). With more than two variables, one can also provide a multidimensional representation of the problem using methods such as principal component analysis (Sect. 4.5) and multiple correspondence analysis (Sect. 4.6). The idea is to reduce the dimensionality of the data set by plotting all observations on 2D graphs that show how they cluster with respect to various characteristics. The resulting groups can, for instance, serve to identify the beneficiaries of a particular intervention. The chapter includes several examples in R-CRAN to illustrate the different methods.
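As a rough illustration of how the choice of test follows the variable types, the sketch below computes the test statistics for the three pairwise cases named above: Pearson's correlation coefficient (two numerical variables), the chi-square statistic (two categorical variables), and the one-way ANOVA F statistic (one numerical, one categorical). The chapter's own examples are in R; this is a standalone Python sketch, and the function names and toy data are hypothetical, not taken from the book. It stops at the statistics and does not compute p-values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two numerical samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing zero correlation."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
print("r =", r, "t =", correlation_t(r, len(x)))

table = [[10, 20], [20, 10]]  # rows and columns are categories
print("chi-square, df =", chi_square_stat(table))

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # numerical outcome per category
print("F =", anova_f(groups))
```

Each statistic would then be compared with its reference distribution (Student's t, chi-square, or F) to obtain a p-value, which statistical software reports directly.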
- Measuring and Visualizing Associations
Benoît Le Maux