2017 | Book

Predictive Data Mining Models

Authors: David L. Olson, Desheng Wu

Publisher: Springer Singapore

Book Series: Computational Risk Management

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book reviews forecasting data mining models, from basic tools for stable data through causal models, to more advanced models using trends and cycles. These models are demonstrated on the basis of business-related data, including stock indices, crude oil prices, and the price of gold. The book’s main approach is above all descriptive, seeking to explain how the methods concretely work; as such, it includes selected citations, but does not go into deep scholarly reference. The data sets and software reviewed were selected for their widespread availability to all readers with internet access.

Frontmatter

Chapter 1. Knowledge Management

Abstract

Knowledge management is an overarching term referring to the ability to identify, store, and retrieve knowledge. Identification requires gathering the information needed and analyzing available data to make effective decisions regarding whatever the organization does. This includes research, digging through records, or gathering data from wherever it can be found. Storage and retrieval of data involve database management, using many tools developed by computer science. Thus knowledge management involves understanding what knowledge is important to the organization, understanding systems important to organizational decision making, database management, and analytic tools of data mining.

David L. Olson, Desheng Wu

Chapter 2. Data Sets

Abstract

Data comes in many forms. The current age of big data floods us with numbers accessible from the Web. We have trading data available in real time (which caused some problems with automatic trading algorithms, so some trading sites impose a delay of 20 minutes or so to make this data less real-time). Wal-Mart has real-time data from its many cash registers, enabling it to automate intelligent decisions to manage its many inventories. Currently a wrist device called a Fitbit is very popular, enabling personal monitoring of individual health numbers, which can be shared in real time with physicians or ostensibly EMT providers. The point is that there is an explosion of data in our world.

David L. Olson, Desheng Wu

Chapter 3. Basic Forecasting Tools

Abstract

We will present two fundamental time series forecasting tools. Moving average is a very simple approach, presented because it is a component of ARIMA models to be covered in a future chapter. Regression is a basic statistical tool. In data mining, it is one of the basic tools for analysis, used in classification applications through logistic regression and discriminant analysis, as well as prediction of continuous data through ordinary least squares (OLS) and other forms. As such, regression is often taught in one (or more) three-hour courses.
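As an illustration of the two tools named in this abstract, the following is a minimal sketch and not the authors' own code. It assumes a simple (unweighted) moving average over a fixed window and a one-variable OLS trend line fitted against the time index; the function names and the sample `prices` list are hypothetical.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


def ols_trend(series):
    """Fit y = intercept + slope * t by ordinary least squares, with t = 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    # Closed-form OLS estimates for a single independent variable.
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Hypothetical, made-up series used only for illustration.
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

ma = moving_average_forecast(prices, 3)           # mean of the last three values
intercept, slope = ols_trend(prices)              # fitted trend line
trend_forecast = intercept + slope * len(prices)  # extrapolate one period ahead
```

On a steadily rising series like this one, the moving average lags behind the trend while the regression line extrapolates it, which is one reason the two tools suit different kinds of data.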

David L. Olson, Desheng Wu

Chapter 4. Multiple Regression

Abstract

Regression models allow you to include as many independent variables as you want. In traditional regression analysis, there are good reasons to limit the number of variables. The spirit of exploratory data mining, however, encourages examining a large number of independent variables. Here we are presenting very small models for demonstration purposes. In data mining applications, the assumption is that you have very many observations, so that there is no technical limit on the number of independent variables.

David L. Olson, Desheng Wu

Chapter 5. Regression Tree Models

Abstract

Decision trees are models that split data at strategic places, dividing it into groups with high probabilities of one outcome or another. They are especially effective with data that has categorical outcomes, but can also be applied to continuous data, such as the time series we have been considering. Decision trees consist of nodes, which are splits in the data defined as particular cutoffs for a particular independent variable, and leaves, which are the outcomes. For categorical data, the outcome is a class. For continuous data, the outcome is a continuous number, usually some average measure of the dependent variable.
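The node-and-leaf structure described above can be sketched for continuous data with a single split. This is a hedged illustration only: it assumes the cutoff is chosen to minimize the summed squared error of the two resulting groups and that each leaf predicts the mean of its group, which is one common choice rather than necessarily the method used in the chapter. The names `best_split` and `predict` and the sample data are hypothetical.

```python
def best_split(xs, ys):
    """Return (cutoff, left_mean, right_mean) for the one split on x
    that minimizes the total squared error of the two leaves."""
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cutoff can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            # Place the cutoff halfway between the neighbouring x values.
            cutoff = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (error, cutoff, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict(x, cutoff, left_mean, right_mean):
    """A leaf's prediction is the average of the dependent variable in that leaf."""
    return left_mean if x <= cutoff else right_mean


# Hypothetical data with two obvious groups.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
cutoff, left_mean, right_mean = best_split(xs, ys)
```

A full regression tree would apply the same search recursively within each group, and across every independent variable, until a stopping rule is met.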

David L. Olson, Desheng Wu

Chapter 6. Autoregressive Models

Abstract

Autoregressive models take advantage of the correlation between errors across time periods. Basic linear regression views this autocorrelation as a negative statistical property, a bias in error terms. Such bias often arises in cyclical data: if the stock market price was high yesterday, it will likely be high today. This contrasts with a random walk, where knowing the error of the last forecast should say nothing about the next error. Traditional regression analysis sought to wash out the bias from autocorrelation. Autoregressive models, to the contrary, seek to utilize this information to make better forecasts. This doesn't always work, but if there are high degrees of autocorrelation, autoregressive models can provide better forecasts.
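To make the idea of exploiting autocorrelation concrete, here is a minimal sketch under stated assumptions. It measures lag-1 autocorrelation on the series itself (rather than on regression errors) and fits a first-order autoregressive model, y_t = c + phi * y_(t-1), by ordinary least squares. The function names and the sample series are hypothetical, and the chapter may well use different model orders or estimation software.

```python
def lag1_autocorrelation(series):
    """Sample correlation between each value and the value one period earlier."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((y - mean) ** 2 for y in series)
    return num / den


def fit_ar1(series):
    """OLS fit of y_t = c + phi * y_(t-1); returns (c, phi)."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    phi = sxy / sxx
    c = y_mean - phi * x_mean
    return c, phi


# Hypothetical series generated exactly by y_t = 2 + 0.5 * y_(t-1), starting at 0.
series = [0.0, 2.0, 3.0, 3.5, 3.75, 3.875]

r1 = lag1_autocorrelation(series)
c, phi = fit_ar1(series)
next_forecast = c + phi * series[-1]  # one-step-ahead forecast
```

Because the sample series contains no noise, the fit recovers the generating coefficients; real price series would give only approximate estimates, and the forecast is only as useful as the autocorrelation is strong.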

David L. Olson, Desheng Wu

Chapter 7. Classification Tools

Abstract

Data mining uses a variety of modeling tools for a variety of purposes. Various authors have viewed these purposes along with available tools (see Table 7.1). There are many other specific methods used as well.

David L. Olson, Desheng Wu

Chapter 8. Predictive Models and Big Data

Abstract

Data mining has proven valuable in almost every academic discipline. Understanding business applications of data mining is necessary to expose business college students to current analytic information technology.

David L. Olson, Desheng Wu

Backmatter

Title: Predictive Data Mining Models
Authors: David L. Olson, Desheng Wu
Publisher: Springer Singapore
Electronic ISBN: 978-981-10-2543-3
Print ISBN: 978-981-10-2542-6
DOI: https://doi.org/10.1007/978-981-10-2543-3
