About this book

R is a powerful and free software system for data analysis and graphics, with over 5,000 add-on packages available. This book introduces R using SAS and SPSS terms with which you are already familiar. It demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download.

The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication-quality graphics, and perform basic statistical analyses.

This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.

Table of Contents

Frontmatter

1. Introduction

Abstract
Norman Nie, one of the founders of SPSS, calls R [55] “The most powerful statistical computing language on the planet.” Written by Ross Ihaka, Robert Gentleman, the R Core Development Team, and an army of volunteers, R provides both a language and a vast array of analytical and graphical procedures. The fact that this level of power is available free of charge has dramatically changed the landscape of research software.
Robert A. Muenchen

2. Installing and Updating R

Abstract
When you purchase SAS, WPS, or SPSS, the vendor sells you a “binary” version: one that the company has compiled for you from the “source code” it wrote in languages such as C, FORTRAN, or Java. You usually install everything you purchased at once and do not give it a second thought. R, by contrast, is modular. The main installation provides Base R and a recommended set of add-on modules called packages. You can install other packages later when you need them. With thousands to choose from, few people need them all.
Robert A. Muenchen

3. Running R

Abstract
There are several ways you can run R: interactively using its programming language, so that you see the result of each command immediately after you submit it; interactively using one of several GUIs that you can add on to R, some of which use programming while others help you avoid programming by using menus and dialog boxes like SPSS, ribbons like Microsoft Office, or flowcharts like SAS Enterprise Guide or SPSS Modeler (formerly Clementine); noninteractively in batch mode using its programming language, where you enter your program into a file and run it all at once; and from within another package, such as Excel, SAS, or SPSS.
Robert A. Muenchen

4. Help and Documentation

Abstract
R has an extensive array of help files and documentation. However, they can be somewhat intimidating at first, since many of them assume you already know a lot about R.
Robert A. Muenchen

5. Programming Language Basics

Abstract
In this chapter we will go through the fundamental features of R. It will be helpful if you can download the book’s files from the Web site http://r4stats.com and run each line as we discuss it. Many of our examples will use our practice data set described in Sect. 1.7.
Robert A. Muenchen

6. Data Acquisition

Abstract
You can enter data directly into R, and you can read data from a wide range of sources. In this chapter I will demonstrate R’s data editor as well as reading and writing data in text, Excel, SAS, SPSS and ODBC formats. For other topics, especially regarding relational databases, see the R Data Import/Export manual [46]. If you are reading data that contain dates or times, see Sect. 10.21.
Robert A. Muenchen

7. Selecting Variables

Abstract
In SAS and SPSS, selecting variables for an analysis is simple, while selecting observations is often much more complicated. In R, these two processes can be almost identical. As a result, variable selection in R is both more flexible and quite a bit more complex. However, since you need to learn that complexity to select observations, it does not require much added effort.
Robert A. Muenchen

8. Selecting Observations

Abstract
It bears repeating that the approaches that R uses to select observations are, for the most part, the same as those discussed in the previous chapter for selecting variables. This chapter builds on that one, so if you have not read it recently, now would be a good time to do so.
Robert A. Muenchen

9. Selecting Variables and Observations

Abstract
In SAS and SPSS, variable selection is done using a very simple yet flexible set of commands using variable names, and the selection of observations is done using logic.
Robert A. Muenchen

10. Data Management

Abstract
An old rule of thumb says that 80% of your data analysis time is spent transforming, reshaping, merging, and otherwise managing your data. SAS and SPSS have a reputation for being more flexible than R for data management. However, as you will see in this chapter, R can do everything SAS and SPSS can do on these important tasks.
Robert A. Muenchen

11. Enhancing Your Output

Abstract
As we have seen, compared to SAS or SPSS, R output is quite sparse and not nicely formatted for word processing. You can improve R’s output by adding value and variable labels. You can also format the output to make beautiful tables to use with word processors, Web pages, and document preparation systems.
Robert A. Muenchen

12. Generating Data

Abstract
Generating data is far more important to R users than it is to SAS or SPSS users. As we have seen, many R functions are controlled by numeric, character, or logical vectors. You can generate those vectors using the methods in this chapter, making quick work of otherwise tedious tasks.
Robert A. Muenchen

13. Managing Your Files and Workspace

Abstract
When using SAS and SPSS, you manage your files using the same operating system commands that you use for your other software. SAS does have a few file management procedures such as DATASETS and CATALOG, but you can get by just fine without them for most purposes.
Robert A. Muenchen

14. Graphics Overview

Abstract
Graphics is perhaps the most difficult topic to compare across SAS, SPSS, and R. Each package contains at least two graphical approaches, each with dozens of options and each with entire books devoted to them. Therefore, we will focus on only two main approaches in R, and we will discuss many more examples in R than in SAS or SPSS. This chapter focuses on a broad, high-level comparison of the three. The next chapter focuses on R’s traditional graphics. The one after that focuses just on the grammar of graphics approaches used in both R and SPSS.
Robert A. Muenchen

15. Traditional Graphics

Abstract
In the previous chapter, we discussed the various graphics packages in R, SAS, and SPSS. Now we will delve into R’s traditional, or base, graphics. Many of these examples will use the practice data set mydata100, which is described in Sect. 1.7.
Robert A. Muenchen

16. Graphics with ggplot2

Abstract
As we discussed in Chap. 14, “Graphics Overview,” the ggplot2 package is an implementation of Wilkinson’s grammar of graphics (hence the “gg” in its name). The last chapter focused on R’s traditional graphics functions. Many plots were easy, but other plots were a lot of work compared to SAS or SPSS. In particular, adding things like legends and confidence intervals was complicated.
Robert A. Muenchen

17. Statistics

Abstract
This chapter demonstrates some basic statistical methods. More importantly, it shows how even in the realm of fairly standard analyses, R differs sharply from the approach used by SAS and SPSS. Since this book is aimed at people who already know SAS or SPSS, I assume you are already familiar with most of these methods. I briefly list each test’s goal and assumptions and how to get R to perform it. For more statistical coverage see Dalgaard’s Introductory Statistics with R [16], or Venables and Ripley’s much more advanced Modern Applied Statistics with S [65].
Robert A. Muenchen

18. Conclusion

Abstract
As we have seen, R differs from SAS and SPSS in many ways. R has a host of features that the other programs lack, such as functions whose internal workings you can see and change, fully integrated macro and matrix capabilities, the most extensive selection of analytic methods available, and a level of flexibility that extends all the way to the core of the system. A detailed comparison of R with SAS and SPSS is contained in Appendix B.
Robert A. Muenchen

Backmatter
