Data Mining

Concepts, Models and Techniques

Written by: Florin Gorunescu

Publisher: Springer Berlin Heidelberg

Book series: Intelligent Systems Reference Library

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

The knowledge discovery process is as old as Homo sapiens. Until some time ago, this process relied solely on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved through the development of data mining technology, aided by the huge computational power of 'artificial' computers. Digging intelligently in large databases, data mining aims to extract implicit, previously unknown, and potentially useful information from data, since “knowledge is power”. The goal of this book is to provide, in a friendly way, both the theoretical concepts and, especially, the practical techniques of this exciting field, ready to be applied in real-world situations. Accordingly, it is meant for all those who wish to learn how to explore and analyze large quantities of data in order to discover the hidden nugget of information.

Table of contents

Frontmatter

Introduction to Data Mining

Abstract

The purpose of this chapter is to introduce and explain the fundamental aspects of data mining used throughout the book. These address three questions: what is data mining, why use it, and how is data mined? The chapter also discusses the problems data mining can solve, issues concerning the modeling process and models, the main data mining applications, and the methodology and terminology used in data mining.

Florin Gorunescu

The “Data-Mine”

Abstract

Data mining deals with data. Basically, a huge amount of data is processed to extract useful, previously unknown patterns. Accordingly, we need more information about the “nugget of knowledge”, the data, that we are dealing with. This chapter is a short review of some important issues concerning data: the definition of data, types of data, data quality, and types of data attributes.

Florin Gorunescu

Exploratory Data Analysis

Abstract

As we stated in the introductory chapter, data mining originates from many scientific areas, one of them being statistics. Since data mining is an analytic process designed to explore large amounts of data in search of consistent and valuable hidden knowledge, the first step in this fabulous research field is an initial data exploration. To build various models and choose the best one based on predictive performance, it is necessary to perform a preliminary exploration of the data to better understand their characteristics. This stage usually starts with data preparation. Then, depending on the nature of the problem to be solved, it can involve anything from simple descriptive statistics to regression models, time series, multivariate exploratory techniques, etc. The aim of this chapter is therefore to provide an overview of the main topics concerning this kind of data analysis.

Florin Gorunescu

Classification and Decision Trees

Abstract

One of the most popular classification techniques used in the data mining process is the classification and decision tree. Because a decision is naturally made once a classification process is complete, both labels rightly appear in the name, though they are usually used separately (i.e., classification trees or decision trees). From now on, we will call them simply decision trees, since the decision is the final goal of this model. The greatest benefits of decision trees are their flexibility and understandability. This chapter presents a short overview of the main steps in building a decision tree and applying it to real-life problems.

Florin Gorunescu

Data Mining Techniques and Models

Abstract

Data mining can also be viewed as a process of model building, and thus the data used to build the model can be understood in ways that we may not have previously taken into consideration. This chapter summarizes some well-known data mining techniques and models, such as the Bayesian classifier, association rule mining and rule-based classifiers, artificial neural networks, k-nearest neighbors, rough sets, clustering algorithms, and genetic algorithms. The reader will thus gain a more complete view of the tools that data mining has borrowed from neighboring fields and uses in a smart and efficient manner to dig in data for hidden knowledge.
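
To give a flavor of one of the techniques named above, here is a minimal k-nearest neighbors sketch in Python. It is not taken from the book: the function name `knn_predict`, the choice of Euclidean distance, the default `k=3`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available since Python 3.8


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs; the name and signature are
    hypothetical, chosen only for this sketch.
    """
    # Sort training points by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Count the labels among those neighbors and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated groups.
train = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((0.9, 1.1), "A"),
    ((4.0, 4.2), "B"),
    ((4.1, 3.9), "B"),
    ((3.8, 4.0), "B"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 4.0)))
```

With two classes, an odd `k` avoids tied votes; other distance measures or weighting schemes are equally plausible choices.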

Florin Gorunescu

Classification Performance Evaluation

Abstract

A large part of this book has presented the fundamentals of the classification process, a crucial field in data mining. It is now time to consider how we can evaluate the performance of different classification (and decision) models. Comparing classifiers is not at all an easy task. There is no single classifier that works best on all problems, a phenomenon related to the “No-free-lunch” metaphor: each classifier (‘restaurant’) provides a specific technique associated with corresponding costs (its ‘menu’ and ‘price’). It is hence up to us, using the information and knowledge at hand, to find the optimal trade-off.
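
The trade-off described above can be illustrated with a small Python sketch. It is not drawn from the book: the metrics chosen here (accuracy, sensitivity, specificity), the helper names, and the label vectors for two hypothetical classifiers are assumptions, and the code assumes both classes occur in the test labels.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, tn, fp, fn) for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, tn, fp, fn


def summarize(actual, predicted):
    """Compute three common summary metrics from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn),  # share of positives that were found
        "specificity": tn / (tn + fp),  # share of negatives left alone
    }


# Invented test labels and predictions from two hypothetical classifiers.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a_preds = [1, 1, 1, 1, 0, 0, 0, 1, 1, 1]  # finds every positive, raises false alarms
model_b_preds = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # no false alarms, misses half the positives

for name, preds in [("A", model_a_preds), ("B", model_b_preds)]:
    print(name, summarize(actual, preds))
```

On this toy data neither model dominates: one is ahead on sensitivity, the other on specificity and accuracy, so the preferred model depends on which kind of error is more costly in the application.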

Florin Gorunescu

Backmatter

Title: Data Mining
Written by: Florin Gorunescu
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-19721-5
Print ISBN: 978-3-642-19720-8
DOI: https://doi.org/10.1007/978-3-642-19721-5
