2013 | Book

The R Software

Fundamentals of Programming and Statistical Analysis

Written by: Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Publisher: Springer New York

Book series: Statistics and Computing

About this book

The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides its application as a self-learning text, this book can support lectures on R at any level from beginner to advanced, for users working on Windows, macOS or Linux. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with object-oriented programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. The second part of the book provides detailed explanations of how to perform many standard statistical analyses, mainly in the field of biostatistics. The mathematical and statistical topics covered are matrix operations, integration, optimization, descriptive statistics, simulations, confidence intervals and hypothesis testing, simple and multiple linear regression, and analysis of variance. Each statistical chapter in the second part relies on one or more real biomedical data sets, kindly made available by the Bordeaux School of Public Health (Institut de Santé Publique, d'Épidémiologie et de Développement - ISPED) and described at the beginning of the book. Each chapter ends with an assessment section: a memorandum of the most important terms, followed by a section of theoretical exercises (to be done on paper), which can be used as questions for a test. In addition, worksheets enable readers to check their new abilities in R. Solutions to all exercises and worksheets are included in the book.

Table of contents

Frontmatter

Preliminaries

Frontmatter
Chapter 1. Introducing R
Abstract
R is a piece of statistical software created by Ross Ihaka and Robert Gentleman [21]. R is both a programming language and a work environment. Commands are executed using descriptive code. Results are displayed as text, and plots are visualized directly in their own window. R is a clone of the statistical software S-plus.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 2. A Few Data Sets and Research Questions
Abstract
This chapter presents a few data sets from epidemiological studies analyzed by various teams at the Bordeaux School of Public Health (Institut de Santé Publique, d'Épidémiologie et de Développement - ISPED). Each data set comes with a short research question, which helps the reader understand the context of the study. The data sets are used throughout this book to show how to use the functionalities of R for importing and manipulating data and performing appropriate statistical analyses. For each data set, we give a table with a description, the variables and the coding.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

The Bases of R

Frontmatter
Chapter 3. Basic Concepts and Data Organisation
Abstract
This chapter introduces the basic concepts of the R software (calculator mode, assignment operator, variables, functions, arguments) and the various data types and structures which can be handled by R.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 4. Importing, Exporting and Producing Data
Abstract
Prerequisite reading: Chapter 3. This chapter describes the instructions to enter data in R. It presents the various possibilities R offers to import or export data, to and from software as different as Excel, SPSS, Minitab, SAS or Matlab. It also shows how to interact with databases (SQL queries). You may benefit from reading the (very complete) manual http://cran.r-project.org/doc/manuals/R-data.pdf.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 5. Data Manipulation, Functions
Abstract
One of the advantages of R is that it can operate on vectors and matrices.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 6. R and Its Documentation
Abstract
R includes online help. It is very complete and very well structured, covering all functions and the various symbols in the language. There are several ways to access the help files; the main one is the function help(), which is used in command line mode.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 7. Drawing Curves and Plots
Abstract
All plots created in R are displayed in special windows, separate from the console. They are called “R graphics: Device device-number”, where device-number is an integer giving the number of the window (or device).
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 8. Programming in R
Abstract
The strength of the R system is that it includes a real programming language. We shall see that it offers very original programming concepts. The concept of objects is very present in R. Object-oriented programming as used in R is transparent for the user, in the sense that you do not need to understand the theory in order to use it. The same cannot be said for the developer who wishes to respect the spirit of R.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 9. Managing Sessions
Abstract
Prerequisite reading: the previous chapters. This chapter describes various procedures to manage R sessions. You have to follow a rather rigorous discipline and a methodology specific to R to make sure you save your work efficiently. We present the commands to save your work: objects you have created, instructions you have typed, plots you have drawn. We also present a few other useful commands and offer a short introduction to package creation.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet

Elementary Mathematics and Statistics

Frontmatter
Chapter 10. Basic Mathematics: Matrix Operations, Integration and Optimization
Abstract
This chapter describes basic mathematical functions. It then gives some usual operations on matrices and the most usual decompositions. We also present a few numerical integration and differentiation functions and the main optimization functions.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 11. Descriptive Statistics
Abstract
This chapter describes the procedures in R to structure your variables, draw standard summary plots of your data and calculate simple numerical statistical summaries on a data set. The data used to illustrate this chapter are from the data set NutriElderly. We also give a few examples of functions to produce prettier plots, useful for presentations or reports.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 12. A Better Understanding of Random Variables, Distributions and Simulations Using R Specificities
Abstract
We use the specificities of R to build empirically the notions of random variable, distribution of a random variable, law of large numbers and central limit theorem. We introduce some complex notions for statistical inference and examine sampling variation as well as the bias and variance of estimators. We go on to describe a few classical methods for simulating from a distribution. At the end of the chapter, we give commands to generate observations from common probability distributions and to calculate their probability and cumulative distribution functions and quantiles.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 13. Confidence Intervals and Hypothesis Testing
Abstract
This chapter is a catalogue of R functions commonly used to get confidence intervals for usual parameters: mean, proportion, variance, median and correlation. We also present a catalogue of R functions to perform standard hypothesis testing. Furthermore, a few practical worksheets will help the reader understand how to interpret confidence intervals, as well as the various errors related to hypothesis testing.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 14. Simple and Multiple Linear Regression
Abstract
This chapter is a brief introduction to simple and multiple linear regression and how to use this method in a real context (see [41] for a more complete presentation). We present the relevant R commands and use a real data set as a connecting thread as we present the key concepts for this method. We treat the case of qualitative explanatory variables, as well as interaction of explanatory variables. We discuss model validation with a study of residuals and mention the issue of collinearity. We also present a few methods for variable selection.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Chapter 15. Elementary Analysis of Variance
Abstract
Prerequisite reading: Chap. 14. This chapter describes the various R commands to perform analysis of variance. We present the standard cases of analysis of variance with 1 factor and 2 factors with or without interaction. We also introduce repeated measures analysis of variance.
Pierre Lafaye de Micheaux, Rémy Drouilhet, Benoit Liquet
Backmatter
Metadata
Title
The R Software
Written by
Pierre Lafaye de Micheaux
Rémy Drouilhet
Benoit Liquet
Copyright year
2013
Publisher
Springer New York
Electronic ISBN
978-1-4614-9020-3
Print ISBN
978-1-4614-9019-7
DOI
https://doi.org/10.1007/978-1-4614-9020-3
