Statistics column
Transform your data
Lesson 2: Transform your data
Transforming your data means applying a non-linear function to your data—usually a log, square-root, or reciprocal function—and analyzing the results rather than the raw data. You may decide not to transform your data—if the raw data are symmetrically distributed, for example—but that should be a conscious choice. Transforming your data before analysis is like focusing a camera before taking a picture. It is almost always worthwhile and makes everything clearer. Many measurements used in
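As a minimal sketch of what transforming means in practice, the Python snippet below applies the three transformations named above to a small set of made-up, right-skewed positive values and prints a simple skewness measure for each. The data, the `skewness` helper, and the use of base-10 logs are illustrative assumptions, not taken from the article.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical positive measurements with a long right tail (not real data).
raw = [0.4, 0.5, 0.7, 0.9, 1.2, 1.6, 2.5, 4.0, 9.0, 30.0]

# The transformations named in the text, plus the untransformed data.
transforms = {
    "raw": lambda x: x,
    "square root": math.sqrt,
    "log": math.log10,
    "reciprocal": lambda x: 1.0 / x,
}

for name, f in transforms.items():
    transformed = [f(x) for x in raw]
    print(f"{name:12s} skewness = {skewness(transformed):+.2f}")
```

A skewness near zero indicates a roughly symmetric distribution. On these made-up numbers the log brings the skewness much closer to zero than the raw values; which transformation works best on real data depends on the measurement.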
Visual clarity
An example of better visual clarity comes from Guilhardi and Church [3]. They did two experiments in which they trained rats to poke their heads into a food cup to get food. Each experiment had two phases: training (during which head pokes were rewarded) and extinction (during which head pokes stopped being rewarded).
During extinction, head pokes became less frequent. This has been observed countless times in learning experiments. To their great credit, Guilhardi and Church managed to see
Statistical clarity
The data from Guilhardi and Church also show how transformations can increase statistical clarity. Did the spread of interresponse times increase from training to extinction, as Figure 2 implies? To find out, for each rat (there were 24 rats) we can compute the standard deviation of interresponse times during 1) training and 2) extinction. Then we can compare the two sets of 24 standard deviations using a t test.
The result of that t test depends on the numbers used to compute the standard
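The procedure described above (one standard deviation per rat per phase, then a t test across the 24 rats) can be sketched as follows. This is an illustration under stated assumptions: the interresponse times are simulated from lognormal distributions rather than taken from Guilhardi and Church, the t test is assumed to be paired because the same rats appear in both phases, and the helper names `paired_t` and `spread_per_rat` are hypothetical.

```python
import math
import random
import statistics


def paired_t(before, after):
    """Paired t statistic for the mean of (after - before); df = n - 1."""
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / standard_error


def spread_per_rat(sessions, transform=lambda x: x):
    """One standard deviation of (optionally transformed) times per rat."""
    return [statistics.stdev([transform(t) for t in irts]) for irts in sessions]


# Simulated, hypothetical interresponse times: NOT the published data.
rng = random.Random(0)
n_rats, n_responses = 24, 200
training = [[rng.lognormvariate(0.0, 0.8) for _ in range(n_responses)]
            for _ in range(n_rats)]
extinction = [[rng.lognormvariate(1.0, 1.1) for _ in range(n_responses)]
              for _ in range(n_rats)]

# Same comparison twice: SDs of raw times, then SDs of log times.
t_raw = paired_t(spread_per_rat(training), spread_per_rat(extinction))
t_log = paired_t(spread_per_rat(training, math.log),
                 spread_per_rat(extinction, math.log))

print(f"t({n_rats - 1}) on SDs of raw times: {t_raw:.1f}")
print(f"t({n_rats - 1}) on SDs of log times: {t_log:.1f}")
```

Running the same test on raw and on log-transformed times shows that the t value depends on which numbers go into the standard deviations. Because the data here are simulated, the two t values say nothing about the size of the effect in the original experiments.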
Acknowledgments
The author thanks Saul Sternberg for helpful comments and Paulo Guilhardi for data.
References (4)
- et al. Influence of inulin on plasma isoflavone concentrations in healthy postmenopausal women. Am J Clin Nutr (2007)
- Exploratory data analysis (1977)
Cited by (14)
Investigating potential associations between neurocognition/social cognition and oxidative stress in schizophrenia
2021, Psychiatry Research
Citation Excerpt: Multiple linear regression analysis was also carried out to evaluate predictors of digit span test (working memory). Score obtained in Hinting Task was normalized using reflected logarithm (Roberts, 2008). Socio-demographic and clinical data are shown in Table 1.
The modulation of operant variation by the probability, magnitude, and delay of reinforcement
2011, Learning and Motivation
Citation Excerpt: Responses to the touchscreen during the ITI (i.e., when the screen was completely blank) were not recorded. Because interresponse times (IRTs) were positively skewed, we performed a log10 transform to normalize the data prior to analysis (cf. Stahlman, Young, et al., 2010; for a discussion on the importance of transformations, see also Roberts, 2008; Tukey, 1977). We then calculated the standard deviation of log IRT within session as our measure of temporal variation.
Plot your data
2009, Nutrition
Effect of Reward Probability on Spatial and Temporal Variation
2010, Journal of Experimental Psychology: Animal Behavior Processes
From a Sampling Precision Perspective, Skewness Is a Friend and Not an Enemy!
2019, Educational and Psychological Measurement