2011 | Book

R for SAS and SPSS Users

About this book

R is a powerful and free software system for data analysis and graphics, with over 5,000 add-on packages available. This book introduces R using SAS and SPSS terms with which you are already familiar. It demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download.

The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses.

This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.

Table of Contents

Frontmatter
1. Introduction
Abstract
Norman Nie, one of the founders of SPSS, calls R [55] “The most powerful statistical computing language on the planet.” Written by Ross Ihaka, Robert Gentleman, the R Core Development Team, and an army of volunteers, R provides both a language and a vast array of analytical and graphical procedures. The fact that this level of power is available free of charge has dramatically changed the landscape of research software.
Robert A. Muenchen
2. Installing and Updating R
Abstract
When you purchase SAS, WPS, or SPSS, they sell you a “binary” version, that is, one that the company has compiled for you from the “source code” version they wrote using languages such as C, FORTRAN, or Java. You usually install everything you purchased at once and do not give it a second thought. R, by contrast, is modular. The main installation provides Base R and a recommended set of add-on modules called packages. You can install other packages later when you need them. With thousands to choose from, few people need them all.
Robert A. Muenchen
3. Running R
Abstract
There are several ways you can run R: interactively using its programming language, where you see the result of each command immediately after you submit it; interactively using one of several GUIs that you can add on to R, some of which use programming while others help you avoid programming by using menus and dialog boxes like SPSS, ribbons like Microsoft Office, or flowcharts like SAS Enterprise Guide or SPSS Modeler (formerly Clementine); noninteractively in batch mode using its programming language, where you enter your program into a file and run it all at once; and from within another package, such as Excel, SAS, or SPSS.
Robert A. Muenchen
4. Help and Documentation
Abstract
R has an extensive array of help files and documentation. However, they can be somewhat intimidating at first, since many of them assume you already know a lot about R.
Robert A. Muenchen
5. Programming Language Basics
Abstract
In this chapter we will go through the fundamental features in R. It will be helpful if you can download the book’s files from the Web site http://r4stats.com and run each line as we discuss it. Many of our examples will use our practice data set described in Sect. 1.7.
Robert A. Muenchen
6. Data Acquisition
Abstract
You can enter data directly into R, and you can read data from a wide range of sources. In this chapter I will demonstrate R’s data editor as well as reading and writing data in text, Excel, SAS, SPSS and ODBC formats. For other topics, especially regarding relational databases, see the R Data Import/Export manual [46]. If you are reading data that contain dates or times, see Sect. 10.21.
Robert A. Muenchen
7. Selecting Variables
Abstract
In SAS and SPSS, selecting variables for an analysis is simple, while selecting observations is often much more complicated. In R, these two processes can be almost identical. As a result, variable selection in R is both more flexible and quite a bit more complex. However, since you need to learn that complexity to select observations, it does not require much added effort.
Robert A. Muenchen
8. Selecting Observations
Abstract
It bears repeating that the approaches that R uses to select observations are, for the most part, the same as those discussed in the previous chapter for selecting variables. This chapter builds on that one, so if you have not read it recently, now would be a good time to do so.
Robert A. Muenchen
9. Selecting Variables and Observations
Abstract
In SAS and SPSS, variable selection is done using a very simple yet flexible set of commands using variable names, and the selection of observations is done using logic.
Robert A. Muenchen
10. Data Management
Abstract
An old rule of thumb says that 80% of your data analysis time is spent transforming, reshaping, merging, and otherwise managing your data. SAS and SPSS have a reputation for being more flexible than R for data management. However, as you will see in this chapter, R can do everything SAS and SPSS can do on these important tasks.
Robert A. Muenchen
11. Enhancing Your Output
Abstract
As we have seen, compared to SAS or SPSS, R output is quite sparse and not nicely formatted for word processing. You can improve R’s output by adding value and variable labels. You can also format the output to make beautiful tables to use with word processors, Web pages, and document preparation systems.
Robert A. Muenchen
12. Generating Data
Abstract
Generating data is far more important to R users than it is to SAS or SPSS users. As we have seen, many R functions are controlled by numeric, character, or logical vectors. You can generate those vectors using the methods in this chapter, making quick work of otherwise tedious tasks.
Robert A. Muenchen
13. Managing Your Files and Workspace
Abstract
When using SAS and SPSS, you manage your files using the same operating system commands that you use for your other software. SAS does have a few file management procedures such as DATASETS and CATALOG, but you can get by just fine without them for most purposes.
Robert A. Muenchen
14. Graphics Overview
Abstract
Graphics is perhaps the most difficult topic to compare across SAS, SPSS, and R. Each package contains at least two graphical approaches, each with dozens of options and each with entire books devoted to them. Therefore, we will focus on only two main approaches in R, and we will discuss many more examples in R than in SAS or SPSS. This chapter focuses on a broad, high-level comparison of the three. The next chapter focuses on R’s traditional graphics. The one after that focuses just on the grammar of graphics approaches used in both R and SPSS.
Robert A. Muenchen
15. Traditional Graphics
Abstract
In the previous chapter, we discussed the various graphics packages in R, SAS, and SPSS. Now we will delve into R’s traditional, or base, graphics. Many of these examples will use the practice data set mydata100, which is described in Sect. 1.7.
Robert A. Muenchen
16. Graphics with ggplot2
Abstract
As we discussed in Chap. 14, “Graphics Overview,” the ggplot2 package is an implementation of Wilkinson’s grammar of graphics (hence the “gg” in its name). The last chapter focused on R’s traditional graphics functions. Many plots were easy, but other plots were a lot of work compared to SAS or SPSS. In particular, adding things like legends and confidence intervals was complicated.
Robert A. Muenchen
17. Statistics
Abstract
This chapter demonstrates some basic statistical methods. More importantly, it shows how even in the realm of fairly standard analyses, R differs sharply from the approach used by SAS and SPSS. Since this book is aimed at people who already know SAS or SPSS, I assume you are already familiar with most of these methods. I briefly list each test’s goal and assumptions and how to get R to perform it. For more statistical coverage see Dalgaard’s Introductory Statistics with R [16], or Venables and Ripley’s much more advanced Modern Applied Statistics with S [65].
Robert A. Muenchen
18. Conclusion
Abstract
As we have seen, R differs from SAS and SPSS in many ways. R has a host of features that the other programs lack such as functions whose internal workings you can see and change, fully integrated macro and matrix capabilities, the most extensive selection of analytic methods available, and a level of flexibility that extends all the way to the core of the system. A detailed comparison of R with SAS and SPSS is contained in Appendix B.
Robert A. Muenchen
Backmatter
Metadata
Title: R for SAS and SPSS Users
Author: Robert A. Muenchen
Copyright Year: 2011
Publisher: Springer New York
Electronic ISBN: 978-1-4614-0685-3
Print ISBN: 978-1-4614-0684-6
DOI: https://doi.org/10.1007/978-1-4614-0685-3
