
2014 | Book

Foundations of Applied Statistical Methods


About this book

This is a text on methods of applied statistics for researchers who design and conduct experiments, perform statistical inference, and write technical reports. These research activities rely on an adequate knowledge of applied statistics. The reader builds on basic statistics skills and learns to apply them to practical scenarios without overemphasis on the technical aspects. Demonstrations are a very important part of this text. Mathematical expressions are shown only if they are defined or intuitively comprehensible. This text may be used as a self-review guidebook for applied researchers or as an introductory statistical methods textbook for students not majoring in statistics. Discussion includes essential probability models; inference of means, proportions, correlations, and regressions; methods for censored survival time data analysis; and sample size determination.

The author has over twenty years of experience in applying statistical methods to study design and data analysis in collaborative medical research settings, as well as in teaching. He received his PhD from the University of Southern California Department of Preventive Medicine, received postdoctoral training at the Harvard Department of Biostatistics, has held faculty appointments at UCLA School of Medicine and Harvard Medical School, and is currently a biostatistics faculty member at Massachusetts General Hospital and Harvard Medical School in Boston, Massachusetts, USA.

Table of Contents

Frontmatter
Chapter 1. Warming Up: Descriptive Statistics and Essential Probability Models
Abstract
This chapter shows how to make sense of gathered data before performing formal statistical inference. The covered topics are types of data, how to visualize data, how to summarize data into a few descriptive statistics (i.e., condensed numerical indices), and an introduction to some useful probability models.
Hang Lee
Chapter 2. Statistical Inference Focusing on a Single Mean
Abstract
Statistical inference asks whether or not the observed sample data provide evidence about the population characteristics of interest. If data on the whole population were gathered, there would be no uncertainty about the population due to sampling, and statistical inference would be unnecessary. Collecting data on the whole population and completing the investigation solely by descriptive data analysis is ideal but unrealistic. For this reason, a sample smaller than the whole population is gathered for an investigation. Since the sample does not cover the entire population, it is not identical to the population. This chapter discusses the relationship between the population and the sample by addressing (1) the uncertainty and errors in the sample, (2) the underpinnings that are necessary for a sound understanding of the applied methods of statistical inference, (3) forms and paradigms of drawing inference, and (4) good study design as a way to minimize the unavoidable errors inherent in sampling.
Hang Lee
Chapter 3. t-Tests for Two Means Comparisons
Abstract
In Chap. 2, the one-sample t-test was introduced to test whether or not a single population mean is equal to a certain value. Chapter 3 introduces the extension of the t-test to examine whether or not the difference between two population means is equal to a certain value. Two situations are discussed: the first is when the two means are from independent (i.e., unrelated) populations, and the second is when the two means are from related populations.
Hang Lee
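As a rough illustration of the two situations named in the Chapter 3 abstract, the sketch below computes a pooled-variance t statistic for independent samples and a paired t statistic for related samples. The function names, the `null_diff` parameter, and the toy measurements are assumptions made for this example and are not taken from the book; the independent-samples version also assumes equal population variances.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(x, y, null_diff=0.0):
    """Pooled-variance t statistic and df for two independent samples."""
    nx, ny = len(x), len(y)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y) - null_diff) / se, nx + ny - 2


def paired_t(x, y, null_diff=0.0):
    """t statistic and df for paired samples, based on within-pair differences."""
    d = [a - b for a, b in zip(x, y)]
    se = stdev(d) / sqrt(len(d))
    return (mean(d) - null_diff) / se, len(d) - 1


# Hypothetical measurements, used once as unrelated groups and once as pairs.
before = [10.0, 12.0, 9.0, 11.0]
after = [8.0, 11.0, 7.0, 10.0]

t_ind, df_ind = independent_t(before, after)
t_pair, df_pair = paired_t(before, after)
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```

On the same numbers the paired statistic is much larger, because pairing removes the between-subject variability from the standard error. Either statistic would then be compared against a t-distribution percentile for the corresponding degrees of freedom.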
Chapter 4. Inference Using Analysis of Variance for Comparing Multiple Means
Abstract
This chapter discusses single-factor analysis of variance (ANOVA), which is mainly applied to compare three or more independent means. The words “single factor” mean that the means are compared across levels of a “single” classification variable (i.e., classification of means by a single categorical variable). The classification variable is called the independent variable or factor (thus, the method is also called single-factor ANOVA), and the outcome variable whose means are compared is called the dependent variable. This method requires certain assumptions: (1) the dependent variable values are observations sampled from a normal distribution and (2) the population variances are equal (homoscedasticity) across the levels of the independent variable.
Hang Lee
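The single-factor comparison described in the Chapter 4 abstract can be sketched as the ratio of between-group to within-group mean squares. This is a minimal illustration, not the book's own code: the function name and the three toy groups are assumptions, and the result is only meaningful under the normality and equal-variance assumptions listed in the abstract.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical outcome values at three levels of one factor.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

The F statistic would then be compared with an upper percentile of the F-distribution with the same pair of degrees of freedom.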
Chapter 5. Linear Correlation and Regression
Abstract
Chap. 1 introduced Pearson’s correlation coefficient as a means of describing a linear association between two continuous measures. In this chapter, inference about the correlation coefficient using sample data is discussed first. The discussion then extends to a related method, regression, and its inference, which examines the linear association of continuous and binary outcomes with one or more variables using sample data.
Hang Lee
Chapter 6. Normal Distribution Assumption-Free Nonparametric Inference
Abstract
Methods for categorical data analysis and rank-based nonparametric methods for continuous data are discussed.
Hang Lee
Chapter 7. Methods for Censored Survival Time Data
Abstract
This chapter deals with survival time data, for which inference cannot be made by any of the parametric and nonparametric methods covered in the previous chapters.
The typical survival time data type is time to event (e.g., time to death, time to treatment failure, time to disease relapse, time to recovery after surgery). The term “survival time” originated from the time-to-death event. The survival time distribution is commonly asymmetric, and another unique feature of these data is censoring, which is difficult to deal with using the methods covered in the previous chapters.
Some common statistical questions are: What is the probability that a subject in a group would survive longer than t years? What is the median survival time of the group (i.e., what is the time point when half of the subjects would remain alive)? Are the survival time distributions significantly different between the two groups (i.e., did Group A survive longer than Group B on average)?
This chapter walks you through two examples that illustrate typical survival time data with censoring and how to tackle the data analysis problem.
Hang Lee
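The first two questions in the Chapter 7 abstract (the probability of surviving beyond time t, and the median survival time) can be illustrated with the Kaplan-Meier product-limit estimator. The abstract does not name this estimator, so treating it as the method is an assumption of this sketch, as are the function names and the toy follow-up data. Censored subjects count as at risk up to their censoring time and then leave the risk set without contributing an event.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First event time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 6, 8]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print("median survival time:", median_survival(curve))
```

Comparing the survival distributions of two groups, the third question, needs a separate test and is not covered by this sketch.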
Chapter 8. Sample Size and Power
Abstract
The ideas behind determining an adequate study sample size for making inferences about one and two means and proportions are discussed, with examples.
Hang Lee
Chapter 9. Review Exercise Problems
Abstract
Review Exercise 1
Hang Lee
Chapter 10. Probability Distribution of Standard Normal Distribution
Abstract
Cumulative probability distribution of standard normal distribution
Hang Lee
Chapter 11. Percentiles of t-Distributions
Abstract
Absolute value of t statistic (i.e., |t|) given df and tail (both upper and lower tails) probability
Hang Lee
Chapter 12. Upper 95th and 99th Percentiles of Chi-Square Distributions
Abstract
Upper 95th (5% upper tail) and 99th (1% upper tail) percentiles of chi-square distributions
Hang Lee
Chapter 13. Upper 95th Percentiles of F-Distributions
Abstract
Upper 95th percentiles of F-distributions
Hang Lee
Chapter 14. Upper 99th Percentiles of F-Distributions
Abstract
Upper 99th percentiles of F-distributions
Hang Lee
Chapter 15. Sample Sizes for Independent Samples t-Tests
Abstract
Sample size per group for two-group independent samples t-test (normal approximation)
Hang Lee
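A per-group sample size under the normal approximation, as named in the Chapter 15 abstract, is commonly computed as n = 2(z_{1-α/2} + z_{1-β})² σ² / δ² for a two-sided test. The sketch below uses that standard formula; whether the book's table uses exactly this form is an assumption, and the function name, parameter names, and example inputs are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-group comparison of means.

    delta: difference in means to detect
    sigma: common standard deviation assumed for both groups
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Detecting a difference of half a standard deviation with 80% power.
n_half_sd = n_per_group(delta=0.5, sigma=1.0)
print("n per group:", n_half_sd)
```

Because the normal quantiles stand in for t quantiles, the result can be slightly smaller than the size obtained from an exact t-based calculation.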
Erratum: Foundations of Applied Statistical Methods
Hang Lee
Backmatter
Metadata
Title
Foundations of Applied Statistical Methods
Author
Hang Lee
Copyright Year
2014
Electronic ISBN
978-3-319-02402-8
Print ISBN
978-3-319-02401-1
DOI
https://doi.org/10.1007/978-3-319-02402-8
