2023 | Book

Applied Linear Regression for Business Analytics with R

A Practical Guide to Data Science with Case Studies

Author: Daniel P. McGibney

Publisher: Springer International Publishing

Book Series: International Series in Operations Research & Management Science

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Applied Linear Regression for Business Analytics with R introduces regression analysis to business students using the R programming language with a focus on illustrating and solving real-time, topical problems. Specifically, this book presents modern and relevant case studies from the business world, along with clear and concise explanations of the theory, intuition, hands-on examples, and the coding required to employ regression modeling. Each chapter includes the mathematical formulation and details of regression analysis and provides in-depth practical analysis using the R programming language.

Frontmatter

Chapter 1. Introduction

Abstract

Business analytics uses modern computing methods to report on, enhance, and provide insights into modern businesses. Regression analysis supports these goals by using data to predict unknown values, assess differences among groups, and check the relationships among variables. When regression analysis is applied to the right data set in the right way, the results can make businesses extremely profitable, whether the objective is predicting the sale price of houses, assessing marketing methods, or predicting the number of likes on a social media post. This book presents many business applications in which regression analysis produces valuable insights. This chapter begins with a discussion of the history of regression analysis and its role in data science, machine learning, and artificial intelligence (AI). We also provide an overview of each of the eight case studies in this book. These cases offer detailed analyses of how to use regression analysis to obtain actionable business findings.

Daniel P. McGibney

Chapter 2. Basic Statistics and Functions Using R

Abstract

Data science is a multifaceted discipline: it requires knowledge of statistics to understand the data, knowledge of programming to manipulate the data, and the know-how to explain the data, which is often best done with one or more visualizations. Beyond naming a branch of mathematics, the word “statistics” carries a second definition referring to the numeric values that summarize a sample, such as the mean, median, standard deviation, and variance. The R programming language is a preferred choice among statisticians and data scientists for manipulating data easily and computing useful statistics on that data. R has many popular plots, but here we will focus on three of the most basic ones, which are necessary for the study of linear regression.
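As a minimal sketch of the summary statistics named in this abstract, the following base R snippet computes them for a small sample. The `sales` vector and its values are made up for illustration and do not come from the book.

```r
# Hypothetical sample; the values are invented for this sketch.
sales <- c(12, 15, 11, 19, 15, 22, 18)

sample_mean   <- mean(sales)    # arithmetic mean
sample_median <- median(sales)  # middle value of the sorted sample
sample_var    <- var(sales)     # sample variance (n - 1 denominator)
sample_sd     <- sd(sales)      # sample standard deviation, sqrt of the variance

print(c(mean = sample_mean, median = sample_median,
        variance = sample_var, sd = sample_sd))
```

Both `var()` and `sd()` use the n - 1 denominator, so they describe a sample rather than a full population.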

Daniel P. McGibney

Chapter 3. Regression Fundamentals

Abstract

With a recent expansion of information collection and storage, businesses increasingly turn to classical analyses of data. In particular, linear regression analysis, while developed more than 200 years ago, remains a fundamental concept in statistics and business analytics. Linear regression is at the heart of many predictive methods, including modern machine learning models.

Daniel P. McGibney

Chapter 4. Simple Linear Regression

Abstract

In Albert Einstein’s quote above, he stresses the paramount importance of simplicity. In regression analysis, focusing on only two variables demonstrates the concepts simply. Thus, in Chap. 2, we calculated the least squares line by using two variables. We also plotted scatterplots and calculated correlation coefficients to further assess the linear relationship. From this analysis, we obtained a detailed understanding of the relationship between two variables. Upon understanding a linear relationship, other more complicated processes become easier to grasp.

Daniel P. McGibney

Chapter 5. Multiple Regression

Abstract

In this chapter, we build upon the coverage of regression analysis by considering situations involving two or more predictor variables. For instance, while the weight of a person may be predicted using their height, we could use both the height and age of that person to predict their weight. Using more than one predictor variable to predict a response is called multiple regression analysis, which enables us to consider more predictor variables and thus obtain better estimates than those possible with simple linear regression.
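The height-and-age example in this abstract can be sketched in R as follows. The data are synthetic, and the data frame name `people`, the units, and the coefficients used to generate `weight` are arbitrary assumptions for the demonstration, not values from the book.

```r
set.seed(1)  # make the synthetic data reproducible
n <- 50
people <- data.frame(
  height = rnorm(n, mean = 170, sd = 10),  # assumed to be in cm
  age    = runif(n, min = 20, max = 60)    # assumed to be in years
)
# Synthetic response: the generating coefficients are arbitrary.
people$weight <- -60 + 0.7 * people$height + 0.2 * people$age +
  rnorm(n, sd = 5)

# Simple linear regression: one predictor.
fit_simple <- lm(weight ~ height, data = people)
# Multiple regression: two predictors.
fit_multiple <- lm(weight ~ height + age, data = people)

print(coef(fit_multiple))
print(summary(fit_simple)$r.squared)
print(summary(fit_multiple)$r.squared)
```

Adding a predictor to a least squares model with an intercept never lowers R-squared on the fitting data, so a higher value alone does not show that the extra predictor is useful.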

Daniel P. McGibney

Chapter 6. Estimation Intervals and Analysis of Variance

Abstract

In some cases, a point estimate alone proves insufficient and must be accompanied by an interval. Hence, we expand our coverage to explore confidence intervals for the mean response and prediction intervals for the predicted value of the response. Chapters 2 and 3 introduced the fundamental concepts behind sum of squares and the explained and unexplained components. Here, we look at these fundamental concepts in more depth and, in the process, provide additional analysis techniques.
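A short sketch of the two kinds of interval, plus the sum-of-squares decomposition, might look like this in R. It uses the `women` data set that ships with base R; the choice of data set, the new height of 65, and the 95% level are assumptions for illustration.

```r
# `women` is a built-in data set with columns height and weight.
fit <- lm(weight ~ height, data = women)
new_obs <- data.frame(height = 65)

# Interval for the mean response at height = 65.
conf_int <- predict(fit, newdata = new_obs,
                    interval = "confidence", level = 0.95)
# Interval for a single new observation at height = 65.
pred_int <- predict(fit, newdata = new_obs,
                    interval = "prediction", level = 0.95)

print(conf_int)
print(pred_int)

# ANOVA table: explained (height) and unexplained (Residuals) sums of squares.
aov_table <- anova(fit)
print(aov_table)
```

Both intervals share the same center, but the prediction interval is wider because it also accounts for the variability of an individual observation around the mean response.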

Daniel P. McGibney

Chapter 7. Predictor Variable Transformations

Abstract

In this chapter, we discuss transformations of predictor variables. One popular transformation consists of dummy variables, which allow the effect of categorical variables to be considered in regression modeling. Dummy variables can be used in regression analysis as both predictor and response variables, but we will limit our discussion to predictor variables. Using dummy variables as the response variable is often referred to as classification, which will remain outside of the scope of this book. In previous chapters, we assumed linear models with untransformed predictor variables. Here, we introduce nonlinear transformations of predictor variables, thereby making the model linear.
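The two ideas in this abstract can be sketched with the built-in `mtcars` data set. The choice of data set, the `transmission` factor, and the log transformation of `hp` are assumptions for illustration; the abstract does not say which transformations the chapter uses.

```r
# mtcars ships with base R; am is coded 0 = automatic, 1 = manual.
cars <- mtcars
cars$transmission <- factor(cars$am, labels = c("automatic", "manual"))

# lm() expands the factor into a 0/1 dummy variable automatically.
fit_dummy <- lm(mpg ~ wt + transmission, data = cars)
design <- model.matrix(fit_dummy)
print(head(design))  # includes a 0/1 column named transmissionmanual

# A nonlinear transformation of a predictor: the model stays linear
# in its coefficients even though log(hp) is nonlinear in hp.
fit_log <- lm(mpg ~ log(hp), data = cars)
print(coef(fit_log))
```

The first factor level ("automatic" here) acts as the baseline, so the `transmissionmanual` coefficient estimates the difference in mean `mpg` for manual cars at a fixed `wt`.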

Daniel P. McGibney

Chapter 8. Model Diagnostics

Abstract

Knowing the correct mathematical model can be tremendously helpful when trying to make predictions. In many cases, however, a mathematical model that is relatively close to the true state may suffice. Yet this process brings its challenges. The quote from Mark Twain shown above alludes to the difficulty of attempting to find a mathematical model that is acceptable enough.

Daniel P. McGibney

Chapter 9. Variable Selection

Abstract

While it may be a bit strict to say that all models are wrong, it is often the case that a model is imperfect. However, an imperfect model may still provide a great amount of value. When attempting to find the best model from the data given, being able to select the predictor variables is of utmost importance in the model building process. In fact, one of the most important aspects of model creation is knowing which predictor variables to use, a process sometimes called feature selection or variable selection. Variable selection can be tremendously helpful when an analyst is attempting to find a mathematical model that is relatively close to the true state.

Daniel P. McGibney

Backmatter

Title: Applied Linear Regression for Business Analytics with R
Author: Daniel P. McGibney
Publisher: Springer International Publishing
Electronic ISBN: 978-3-031-21480-6
Print ISBN: 978-3-031-21479-0
DOI: https://doi.org/10.1007/978-3-031-21480-6
