
About this book

1.1 Welcome to ggplot2

ggplot2 is an R package for producing statistical, or data, graphics, but it is unlike most other graphics packages because it has a deep underlying grammar. This grammar, based on the Grammar of Graphics (Wilkinson, 2005), is composed of a set of independent components that can be composed in many different ways. This makes ggplot2 very powerful, because you are not limited to a set of pre-specified graphics, but you can create new graphics that are precisely tailored for your problem. This may sound overwhelming, but because there is a simple set of core principles and very few special cases, ggplot2 is also easy to learn (although it may take a little time to forget your preconceptions from other graphics tools). Practically, ggplot2 provides beautiful, hassle-free plots, that take care of fiddly details like drawing legends. The plots can be built up iteratively and edited later. A carefully chosen set of defaults means that most of the time you can produce a publication-quality graphic in seconds, but if you do have special formatting requirements, a comprehensive theming system makes it easy to do what you want. Instead of spending time making your graph look pretty, you can focus on creating a graph that best reveals the messages in your data.

Table of Contents

Frontmatter

Chapter 1. Introduction

ggplot2 is an R package for producing statistical, or data, graphics, but it is unlike most other graphics packages because it has a deep underlying grammar. This grammar, based on the Grammar of Graphics (Wilkinson, 2005), is composed of a set of independent components that can be composed in many different ways. This makes ggplot2 very powerful, because you are not limited to a set of pre-specified graphics, but you can create new graphics that are precisely tailored for your problem.
Hadley Wickham

Chapter 2. Getting started with qplot

In this chapter, you will learn to make a wide variety of plots with your first ggplot2 function, qplot(), short for quick plot. qplot makes it easy to produce complex plots, often requiring several lines of code using other plotting systems, in one line. qplot() can do this because it’s based on the grammar of graphics, which allows you to create a simple, yet expressive, description of the plot. In later chapters you’ll learn to use all of the expressive power of the grammar, but here we’ll start simple so you can work your way up. You will also start to learn some of the ggplot2 terminology that will be used throughout the book.
Hadley Wickham
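
As a rough illustration of the one-line style described in the Chapter 2 summary, the sketch below draws a scatterplot with qplot(). The choice of the mpg dataset (bundled with ggplot2) and of the displ, hwy and class columns is an assumption made for this example, not something taken from the chapter. Note that qplot() is deprecated in recent ggplot2 releases, so the call may emit a warning.

```r
library(ggplot2)

# One-line scatterplot: engine displacement against highway mileage,
# coloured by vehicle class. The dataset and columns are illustrative
# choices (mpg ships with ggplot2), not examples from the book.
p <- qplot(displ, hwy, data = mpg, colour = class)

# Printing the object renders the plot on the active graphics device.
print(p)
```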

Chapter 3. Mastering the grammar

You can choose to use just qplot(), without any understanding of the underlying grammar, but if you do you will never be able to unlock the full power of ggplot2. By learning more about the grammar and its components, you will be able to create a wider range of plots, as well as being able to combine multiple sources of data, and customise to your heart’s content. You may want to skip this chapter in a first reading of the book, returning when you want a deeper understanding of how all the pieces fit together.
Hadley Wickham

Chapter 4. Build a plot layer by layer

Layering is the mechanism by which additional data elements are added to a plot. Each layer can come from a different dataset and have a different aesthetic mapping, allowing us to create plots that could not be generated using qplot(), which permits only a single dataset and a single set of aesthetic mappings.
Hadley Wickham
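
The idea in the Chapter 4 summary, that each layer can have its own dataset and aesthetic mapping, might be sketched as follows. The summary table class_means and its mean_hwy column are hypothetical names introduced for this example; the mpg dataset is again an assumed stand-in.

```r
library(ggplot2)

# Hypothetical second dataset: mean highway mileage per vehicle class.
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
names(class_means)[2] <- "mean_hwy"

p <- ggplot(mpg, aes(class, hwy)) +
  # Layer 1: raw observations, using the plot's default data and mapping.
  geom_point(alpha = 0.3) +
  # Layer 2: a different dataset with its own y mapping; x is inherited.
  geom_point(data = class_means, aes(y = mean_hwy),
             colour = "red", size = 4)

print(p)
```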

Chapter 5. Toolbox

The layered structure of ggplot2 encourages you to design and construct graphics in a structured manner. You have learned what a layer is and how to add one to your graphic, but not what geoms and statistics are available to help you build revealing plots. This chapter lists some of the many geoms and stats included in ggplot2, broken down by their purpose. This chapter will provide a good overview of the available options, but it does not describe each geom and stat in detail. For more information about individual geoms, along with many more examples illustrating their use, see the online and electronic documentation. You may also want to consult the documentation to learn more about the datasets used in this chapter.
Hadley Wickham

Chapter 6. Scales, axes and legends

Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can perceive visually: e.g., size, colour, position or shape. Scales also provide the tools you use to read the plot: the axes and legends (collectively known as guides). Formally, each scale is a function from a region in data space (the domain of the scale) to a region in aesthetic space (the range of the scale). The domain of each scale corresponds to the range of the variable supplied to the scale, and can be continuous or discrete, ordered or unordered. The range consists of the concrete aesthetics that you can perceive and that R can understand: position, colour, shape, size and line type. If you blinked when you read that scales map data both to position and colour, you are not alone. The notion that the same kind of object is used to map data to positions and symbols strikes some people as unintuitive. However, you will see the logic and power of this notion as you read further in the chapter.
Hadley Wickham
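
To make the domain-to-range description in the Chapter 6 summary more concrete, here is a minimal sketch in which one scale maps data to position and another maps data to colour. The dataset, the limits and the two colours are assumptions chosen for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = cty)) +
  geom_point() +
  # Position scale: sets the domain (data limits) shown on the x axis.
  scale_x_continuous(limits = c(1, 7)) +
  # Colour scale: maps the domain of cty onto a blue-to-red range.
  scale_colour_gradient(low = "blue", high = "red")

print(p)
```

Both calls are ordinary scale functions, which is the sense in which the same kind of object handles position and colour.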

Chapter 7. Positioning

This chapter discusses position, particularly how facets are laid out on a page, and how coordinate systems within a panel work. There are four components that control position.
Hadley Wickham

Chapter 8. Polishing your plots for publication

In this chapter you will learn how to prepare polished plots for publication. Most of this chapter focusses on the theming capability of ggplot2 which allows you to control many non-data aspects of plot appearance, but you will also learn how to adjust geom, stat and scale defaults, and the best way to save plots for inclusion into other software packages. Together with the next chapter, manipulating plot rendering with grid, you will learn how to control every visual aspect of the plot to get exactly the appearance that you want.
Hadley Wickham
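
A small sketch of the two tasks mentioned in the Chapter 8 summary, adjusting non-data appearance with a theme and saving the result for use elsewhere. The theme choices, the output size and the temporary PDF path are all assumptions for this example.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  # Non-data appearance: a complete theme plus one element override.
  theme_bw() +
  theme(legend.position = "bottom")

# Save to a temporary PDF; width and height are in inches by default.
out <- tempfile(fileext = ".pdf")
ggsave(out, plot = p, width = 6, height = 4)
```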

Chapter 9. Manipulating data

So far this book has assumed you have your data in a nicely structured data frame ready to feed to ggplot() or qplot(). If this is not the case, then you’ll need to do some transformation.
Hadley Wickham

Chapter 10. Reducing duplication

A major requirement of a good data analysis is flexibility. If the data changes, or you discover something that makes you rethink your basic assumptions, you need to be able to easily change many plots at once. The main inhibitor of flexibility is duplication. If you have the same plotting statement repeated over and over again, you have to make the same change in many different places. Often just the thought of making all those changes is exhausting!
Hadley Wickham
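
One possible way to reduce the duplication described in the Chapter 10 summary is to store shared plot components in a list and add that list to each plot. This is a sketch under that assumption and not necessarily the approach the chapter takes; scatter_layers is a hypothetical name.

```r
library(ggplot2)

# Shared components defined once; a list of layers and themes can be
# added to a plot with `+` just like a single component.
scatter_layers <- list(
  geom_point(alpha = 0.5),
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE),
  theme_minimal()
)

# Two plots that reuse the same components with different y variables.
p1 <- ggplot(mpg, aes(displ, hwy)) + scatter_layers
p2 <- ggplot(mpg, aes(displ, cty)) + scatter_layers
```

Changing scatter_layers in one place then updates every plot built from it.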

Backmatter

Additional information
