About this book

The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides serving as a self-learning text, this book can support lectures on R at any level, from beginner to advanced, for users working on Windows, macOS or Linux. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with object-oriented programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. The second part of the book provides detailed explanations of how to perform many standard statistical analyses, mainly in the field of biostatistics. The mathematical and statistical topics covered include matrix operations, integration, optimization, descriptive statistics, simulations, confidence intervals and hypothesis testing, simple and multiple linear regression, and analysis of variance. Each statistical chapter in the second part relies on one or more real biomedical data sets, kindly made available by the Bordeaux School of Public Health (Institut de Santé Publique, d'Épidémiologie et de Développement - ISPED) and described at the beginning of the book. Each chapter ends with an assessment section: a memorandum of the most important terms, followed by a set of theoretical exercises (to be done on paper), which can be used as test questions. In addition, worksheets enable readers to check their new abilities in R. Solutions to all exercises and worksheets are included in the book.

Table of Contents

Frontmatter

Preliminaries

Frontmatter

Chapter 1. Introducing R

Abstract
R is a piece of statistical software created by Ross Ihaka and Robert Gentleman [21]. R is both a programming language and a work environment. Commands are executed using descriptive code. Results are displayed as text, and plots are visualized directly in their own window. R is a clone of the statistical software S-plus.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 2. A Few Data Sets and Research Questions

Abstract
This chapter presents a few data sets from epidemiological studies analyzed by various teams at the Bordeaux School of Public Health (Institut de Santé Publique, d'Épidémiologie et de Développement - ISPED). Each data set comes with a short research question, which helps the reader understand the context of the study. The data sets are used throughout this book to show how to use the functionalities of R for importing and manipulating data and performing appropriate statistical analyses. For each data set, we give a table with a description, the variables and the coding.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

The Bases of R

Frontmatter

Chapter 3. Basic Concepts and Data Organisation

Abstract
This chapter introduces the basic concepts of the R software (calculator mode, assignment operator, variables, functions, arguments) and the various data types and structures which can be handled by R.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 4. Importing, Exporting and Producing Data

Abstract
Prerequisite: Chapter 3. This chapter describes the instructions for entering data in R. It presents the various possibilities R offers to import or export data, to and from software as varied as Excel, SPSS, Minitab, SAS or Matlab. It also shows how to interact with databases (SQL queries). You may benefit from reading the (very complete) manual http://cran.r-project.org/doc/manuals/R-data.pdf.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 5. Data Manipulation, Functions

Abstract
One of the advantages of R is that it can operate on vectors and matrices.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
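
As a minimal illustration of what operating on vectors and matrices looks like, here is a sketch using only base R. The variable names are illustrative assumptions and are not taken from the book.

```r
# Arithmetic is applied element by element, without an explicit loop.
x <- c(1, 2, 3)
y <- 2 * x + 1              # 3 5 7

# matrix() fills its values column by column by default.
A <- matrix(1:4, nrow = 2)  # first column is (1, 2), second is (3, 4)
At <- t(A)                  # transpose of A
AA <- A %*% A               # matrix product (not element-wise)
```

Note that `*` between two matrices would multiply element by element; `%*%` is the operator for the usual matrix product.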

Chapter 6. R and Its Documentation

Abstract
R includes a built-in help system. It is very complete and well structured, covering all functions and the various symbols in the language. There are several ways to access the help files; the main one is the help() function, which is called from the command line.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
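
A short sketch of how help() and a related base R lookup can be called from the command line. The choice of `mean` as the topic is purely illustrative.

```r
# Store the help lookup for a function; in an interactive session,
# printing this object (or typing help("mean") directly) opens the page.
h <- help("mean")

# `?` is a shorthand for help(): typing ?mean is equivalent to help("mean").

# apropos() lists objects whose names match a regular expression,
# which is handy when only part of a function name is remembered.
matches <- apropos("^mean$")
```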

Chapter 7. Drawing Curves and Plots

Abstract
All plots created in R are displayed in special windows, separate from the console. They are called “R graphics: Device device-number”, where device-number is an integer giving the number of the window (or device).
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 8. Programming in R

Abstract
The strength of the R system is that it includes a real programming language. We shall see that it offers very original programming concepts. The concept of objects is very present in R. Object-oriented programming as used in R is transparent for the user, in the sense that you do not need to understand the theory in order to use it. The same cannot be said for the developer who wishes to respect the spirit of R.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 9. Managing Sessions

Abstract
Prerequisite: reading the previous chapters. This chapter describes various procedures to manage R sessions. You have to follow a rather rigorous discipline and a methodology specific to R to make sure you save your work efficiently. We present the commands to save your work: objects you have created, instructions you have typed, plots you have drawn. We also present a few other useful commands and offer a short introduction to package creation.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Elementary Mathematics and Statistics

Frontmatter

Chapter 10. Basic Mathematics: Matrix Operations, Integration and Optimization

Abstract
This chapter describes basic mathematical functions. It then gives some usual operations on matrices and the most usual decompositions. We also present a few numerical integration and differentiation functions and the main optimization functions.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 11. Descriptive Statistics

Abstract
This chapter describes the procedures in R to structure your variables, draw standard summary plots of your data and calculate simple numerical statistical summaries on a data set. The data used to illustrate this chapter are from the data set NutriElderly. We also give a few examples of functions to produce prettier plots, useful for presentations or reports.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 12. A Better Understanding of Random Variables, Distributions and Simulations Using R Specificities

Abstract
We use the specificities of R to build empirically the notions of random variable, distribution of a random variable, law of large numbers and central limit theorem. We introduce some complex notions for statistical inference and examine sampling variation as well as the bias and variance of estimators. We go on to describe a few classical methods for simulating from a distribution. At the end of the chapter, we give commands to generate observations from common probability distributions and to calculate their probability and cumulative distribution functions and quantiles.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
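
The kind of commands mentioned at the end of this abstract can be sketched with base R's naming scheme for distributions, here for the normal distribution (prefixes r, d, p and q). The seed, sample size and variable names are illustrative assumptions, not taken from the book.

```r
set.seed(123)                        # make the simulation reproducible
x <- rnorm(1000, mean = 0, sd = 1)   # r*: random draws

d0 <- dnorm(0)        # d*: density at 0, equal to 1 / sqrt(2 * pi)
p0 <- pnorm(0)        # p*: cumulative probability P(X <= 0), equal to 0.5
q975 <- qnorm(0.975)  # q*: quantile, approximately 1.96

# Law of large numbers in miniature: the running mean of the draws
# settles near the theoretical mean 0 as more draws are included.
running_mean <- cumsum(x) / seq_along(x)
```

The same four prefixes apply to other distributions in base R, for example `rbinom`, `dbinom`, `pbinom` and `qbinom` for the binomial.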

Chapter 13. Confidence Intervals and Hypothesis Testing

Abstract
This chapter is a catalogue of R functions commonly used to get confidence intervals for usual parameters: mean, proportion, variance, median and correlation. We also present a catalogue of R functions to perform standard hypothesis testing. Furthermore, a few practical worksheets will help the reader understand how to interpret confidence intervals, as well as the various errors related to hypothesis testing.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 14. Simple and Multiple Linear Regression

Abstract
This chapter is a brief introduction to simple and multiple linear regression and how to use this method in a real context (see [41] for a more complete presentation). We present the relevant R commands and use a real data set as a connecting thread as we present the key concepts for this method. We treat the case of qualitative explanatory variables, as well as interaction of explanatory variables. We discuss model validation with a study of residuals and mention the issue of collinearity. We also present a few methods for variable selection.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Chapter 15. Elementary Analysis of Variance

Abstract
Prerequisite: Chapter 14. This chapter describes the various R commands to perform analysis of variance. We present the standard cases of analysis of variance with one factor and with two factors, with or without interaction. We also introduce repeated measures analysis of variance.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Backmatter
