
About this book

R for Business Analytics looks at some of the most common tasks performed by business analysts and helps the user navigate the wealth of information in R and its 4000 packages. With this information the reader can select the packages that handle each analytical task with minimum effort and maximum usefulness. The book emphasizes Graphical User Interfaces (GUIs) to flatten R’s famously steep learning curve. It aims to help you get started with analytics, and includes chapters on data visualization, code examples on web analytics and social media analytics, clustering, regression models, text mining, data mining models, and forecasting. The book tries to expose the reader to a breadth of business analytics topics without burying the user in needless depth. The included references and links allow the reader to pursue business analytics topics further.

This book is aimed at business analysts with basic programming skills who want to use R for business analytics. Note that the book’s scope is neither statistical theory nor graduate-level statistical research; it is written for business analytics practitioners. Business analytics (BA) refers to the exploration and investigation of data generated by businesses. Business Intelligence (BI) is the seamless dissemination of information through the organization, primarily past and current business metrics used for decision support. Data Mining (DM) is the process of discovering new patterns in large data using algorithms and statistical methods. To differentiate between the three: BI is mostly current reports, BA builds models to predict and strategize, and DM matches patterns in big data. The R statistical software is the fastest growing analytics platform in the world, and is established in both academia and corporations for robustness, reliability, and accuracy.

The book follows Albert Einstein’s famous remark about making things as simple as possible, but no simpler. It aims to dispel any remaining doubts about using R in your business environment. Even non-technical users will enjoy the easy-to-use examples. The interviews with creators and corporate users of R make the book very readable. The author firmly believes that Isaac Asimov was better at spreading science than any textbook or journal author.



Chapter 1. Why R

In this chapter we introduce the reader to R, discuss reasons for choosing R as an analytical and not just a statistical computing platform, make comparisons with other analytical software, and present some broad costs and benefits in using R in a business environment.

A. Ohri

Chapter 2. R Infrastructure

In this chapter we discuss the practical realities in setting up an analytical environment based on R, including hardware, software, budgeting, and training needs. We will also walk through the basics of installing R, R’s library of packages, updating R, and accessing the comprehensive user help.

A. Ohri

Chapter 3. R Interfaces

In this chapter we discuss the various ways to interface R and to use R analytics based on one’s needs. We will cover how to minimize the time spent learning to perform tasks in R by using a GUI instead of the command line. In addition, we will learn how to interface to R from other software as well as use it from an Amazon cloud computing environment. We will also discuss the relative merits and demerits of various R interfaces.

A. Ohri

Chapter 4. Manipulating Data

R has different types of data storage, such as lists, arrays, and data frames. This can be confusing for analysts whose background is purely in handling rectangular datasets (with rows for records and columns for variables). The first, and often the toughest or most time-consuming, task in a new analytical project is getting the data loaded into the analytical software. This chapter discusses the techniques for reading in data from various formats. The two main methods of inputting data, the command line and a GUI, are covered, and different packages for bigger datasets (>1 GB) are discussed. In addition, obtaining data from various types of databases is specifically mentioned. Analyzing data can have many challenges associated with it. In the case of business analytics data, these challenges or constraints can have a marked effect on the quality and timeliness of the analysis as well as the expected versus actual payoff from the analytical results.

A. Ohri

Chapter 5. Exploring Data

While Chap. 4 dealt with getting your data in shape for processing (or, as it is commonly known, data preprocessing), in this chapter we actually start looking at slices of data to generate insights. We will emphasize data visualization, both because data volumes keep growing and because business audiences find visuals easy to understand. The fact that R currently has one of the most advanced graphical libraries also helps. We will be using basic graphical capabilities but will also briefly touch on advanced customization using the acclaimed ggplot2 package.

A. Ohri

Chapter 6. Building Regression Models

One of the most common uses of statistical software is for building models, specifically logistic regression models for propensity in the marketing of goods and services. Within the R Project, regression packages are shown in the documentation in both the Econometrics view and the Finance view. A basic summary of all the R functions used for building regression models is also available online. A very good textbook on the basics of regression is Practical Regression and Anova Using R by Julian J. Faraway (available for free online).
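As a rough illustration of a propensity model of the kind described above, the following minimal sketch fits a logistic regression with base R's `glm()`. The data are simulated, and the variable names (`age`, `visits`, `purchased`) and coefficients are assumptions made up for this example, not taken from the book.

```r
# Hypothetical propensity example on simulated data (not from the book).
set.seed(42)
n <- 500
customers <- data.frame(
  age    = round(runif(n, 18, 70)),
  visits = rpois(n, lambda = 3)
)

# Assumed "true" relationship, used only to simulate purchase outcomes
log_odds <- -3 + 0.03 * customers$age + 0.4 * customers$visits
customers$purchased <- rbinom(n, size = 1, prob = plogis(log_odds))

# Fit a logistic regression with base R's glm()
fit <- glm(purchased ~ age + visits, data = customers, family = binomial)

# Predicted purchase probability (the propensity score) for each customer
customers$propensity <- predict(fit, type = "response")

# Coefficient table, then the customers with the highest predicted propensity
print(summary(fit)$coefficients)
print(head(customers[order(-customers$propensity), ]))
```

In a marketing setting, the customers at the top of this ranking would typically be the first candidates for a campaign.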


A. Ohri

Chapter 7. Data Mining Using R

Data mining is a common term that is often used interchangeably with business analytics, but the two are not exactly the same.

A. Ohri

Chapter 8. Clustering and Data Segmentation

Cluster analysis is a data reduction technique that sorts a large number of objects into groups, or clusters, such that objects in the same cluster are more similar to each other than to objects in other clusters. Clustering is used in business analytics to identify groups of customers that can be targeted with similar products, to understand products and markets, and to reduce data to an actionable strategy, especially in cases where data are not sufficiently clean or exhaustive to create predictive models.
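The idea of grouping similar customers can be sketched with k-means, one common clustering method available in base R as `kmeans()`. The chapter abstract does not say which algorithms the book uses, so the choice of k-means, the simulated data, and the column names (`annual_spend`, `orders`) are assumptions for illustration only.

```r
# Hypothetical customer segmentation on simulated data (not from the book).
set.seed(1)

# Two simulated customer groups: low spenders and high spenders
customers <- data.frame(
  annual_spend = c(rnorm(50, mean = 200, sd = 30), rnorm(50, mean = 800, sd = 60)),
  orders       = c(rnorm(50, mean = 3,   sd = 1),  rnorm(50, mean = 12,  sd = 2))
)

# Standardize so both variables contribute comparably to the distances
scaled <- scale(customers)

# k-means with two clusters; nstart reruns the algorithm from several random starts
km <- kmeans(scaled, centers = 2, nstart = 10)
customers$segment <- km$cluster

# Average profile of each segment, on the original scale
print(aggregate(cbind(annual_spend, orders) ~ segment, data = customers, FUN = mean))
```

Each row of the summary describes one segment, which is the kind of reduced, actionable view of the data the paragraph above refers to.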

A. Ohri

Chapter 9. Forecasting and Time Series Models

A time series is a sequence of values of some quantity or variable observed at successive time intervals (months, weeks, days, hours, etc.). The working assumption is that the future value of a variable is related in some way to its present value and to the length of time between the two observations.
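The dependence of a value on earlier values can be checked with a lag correlation. The sketch below simulates a hypothetical monthly sales series in which each value depends on the previous one, stores it as a base R `ts` object, and computes its autocorrelation with `acf()`. The series, its coefficients, and the start date are assumptions chosen for the example.

```r
# Hypothetical monthly sales series on simulated data (not from the book).
set.seed(7)
n <- 48
sales <- numeric(n)
sales[1] <- 100

# Each value depends on the previous value plus random noise
for (i in 2:n) {
  sales[i] <- 20 + 0.8 * sales[i - 1] + rnorm(1, sd = 5)
}

# Store as a monthly time series starting in January 2020
sales_ts <- ts(sales, start = c(2020, 1), frequency = 12)

# Correlation between each value and the value one month earlier
lag1_cor <- cor(sales[-1], sales[-n])
print(lag1_cor)

# Autocorrelation at lags up to six months, computed without plotting
acf_values <- acf(sales_ts, lag.max = 6, plot = FALSE)
print(acf_values)
```

A lag-one correlation well away from zero is the signal that past values carry information about future ones, which is what forecasting models exploit.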

A. Ohri

Chapter 10. Data Export and Output

Exporting data and saving results, graphs, and code are important steps in completing the final documentation and presentation for an analytical project. What are the various formats available in R for exporting graphs? The function capabilities() reports which optional features, including graphics formats such as JPEG, PNG, and TIFF, are available in the current R build.
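A minimal sketch of the export step, using only base R: `capabilities()` returns a named logical vector, `pdf()` opens a file-based graphics device, and `write.csv()` saves a data frame. The file names and the plotted values are placeholders invented for this example.

```r
# capabilities() returns a named logical vector of optional features in this R build
caps <- capabilities()
print(caps[intersect(c("jpeg", "png", "tiff", "cairo"), names(caps))])

# Export a plot to a PDF file; the file name is a placeholder
out_file <- file.path(tempdir(), "example_plot.pdf")
pdf(out_file, width = 6, height = 4)
plot(1:10, (1:10)^2, type = "b",
     main = "Example plot", xlab = "x", ylab = "x squared")
invisible(dev.off())  # close the device so the file is written

# Export a small data frame to CSV; the values are made up
csv_file <- file.path(tempdir(), "results.csv")
write.csv(data.frame(x = 1:3, y = c(2.5, 3.1, 4.8)), csv_file, row.names = FALSE)

print(file.exists(c(out_file, csv_file)))
```

On a machine without a display, `capabilities()` may emit a warning about X11; the entries it returns are still valid logical values.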

A. Ohri

Chapter 11. Optimizing R Code

As the previous chapters have shown, multiple techniques are available in R for powerful data-driven insights and analysis. For the average business analyst, well-designed GUI tools that are stable, can pull in data and models, and can report on them are essential, and all of these are available within various R subcomponents and packages. This chapter is aimed at analysts who wish to tune their overall R experience by measuring R performance and improving it using some well-known and some recently introduced utilities.

A. Ohri

Chapter 12. Additional Training Literature

Blogs, email help groups, and Web sites are important sources of training literature as well as tutorials. While the mix of books, journal articles, blog posts, and online content is often a matter of personal choice, the reader should choose based on his or her own business or analytical needs.

A. Ohri

Chapter 13. Appendix

Google Analytics is the most widely used Web analytics software on the Internet, and using R we can do advanced analytics or build a custom Web analytics solution with it.

A. Ohri

