
2021 | Book

Machine Learning


About this Book

Machine Learning, a vital and core area of artificial intelligence (AI), is propelling the AI field ever further and making it one of the most compelling areas of computer science research. This textbook offers a comprehensive and unbiased introduction to almost all aspects of machine learning, from the fundamentals to advanced topics. It consists of 16 chapters divided into three parts: Part 1 (Chapters 1-3) introduces the fundamentals of machine learning, including terminology, basic principles, evaluation, and linear models; Part 2 (Chapters 4-10) presents classic and commonly used machine learning methods, such as decision trees, neural networks, support vector machines, Bayesian classifiers, ensemble methods, clustering, dimensionality reduction, and metric learning; Part 3 (Chapters 11-16) introduces advanced topics, covering feature selection and sparse learning, computational learning theory, semi-supervised learning, probabilistic graphical models, rule learning, and reinforcement learning. Each chapter includes exercises and further reading, so that readers can explore areas of interest.

The book can be used as an undergraduate or postgraduate textbook for computer science, computer engineering, electrical engineering, data science, and related majors. It is also a useful reference resource for researchers and practitioners of machine learning.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
After a drizzle, we take a walk on the wet street. Feeling the gentle breeze and seeing the sunset glow, we bet the weather will be nice tomorrow. Walking to a fruit stand, we pick up a green watermelon with a curly root and a muffled knocking sound; while hoping the watermelon is ripe, we also expect good academic marks this semester after all the hard work on our studies. We wish readers to share the same confidence in their studies, but to begin with, let us have an informal discussion of what machine learning is.
Zhi-Hua Zhou
Chapter 2. Model Selection and Evaluation
Abstract
In general, the proportion of incorrectly classified samples to the total number of samples is called the error rate; that is, if a out of m samples are misclassified, then the error rate is \(E=a/m\).
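As a concrete illustration, here is a minimal sketch of this definition (the function and variable names are ours, not the book's):

```python
def error_rate(y_true, y_pred):
    """Error rate E = a/m: misclassified count a over total sample count m."""
    m = len(y_true)
    a = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return a / m

# 2 of 5 predictions are wrong, so E = 2/5 = 0.4.
print(error_rate([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))
```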
Zhi-Hua Zhou
Chapter 3. Linear Models
Abstract
Let \(\boldsymbol{x} = (x_1;x_2;\ldots ;x_d)\) be a sample described by d variables, where \(\boldsymbol{x}\) takes the value \(x_i\) on the ith variable.
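To preview how this notation is put to work, the linear models of this chapter make predictions through a linear combination of the d variables; in the standard form (our summary, not a quotation from the chapter):
\[ f(\boldsymbol{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d + b = \boldsymbol{w}^{\mathrm{T}}\boldsymbol{x} + b, \]
where \(\boldsymbol{w} = (w_1; w_2; \ldots ; w_d)\) and b are learned from data.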
Zhi-Hua Zhou
Chapter 4. Decision Trees
Abstract
Decision trees are a popular class of machine learning methods. Taking binary classification as an example, we can regard the task as deciding the answer to the question "Is this instance positive?" As the name suggests, a decision tree makes decisions based on a tree structure, which is also a common decision-making mechanism used by humans. For example, to answer the question "Is this watermelon ripe?" we usually go through a series of judgments or sub-decisions: we first consider "What is the color?" If it is green, then "What is the shape of the root?" If it is curly, then "What is the knocking sound?" Finally, based on these observations, we decide whether the watermelon is ripe or not. Such a decision process is illustrated in Figure 4.1.
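That sequence of judgments can be pictured as nested conditionals; a minimal sketch of the watermelon decisions just described (the branches are simplified, and the function is ours):

```python
def is_ripe(color, root, sound):
    """A hand-written decision tree mirroring the judgments in Figure 4.1."""
    if color != "green":
        return False            # first judgment: what is the color?
    if root != "curly":
        return False            # second judgment: what is the shape of the root?
    return sound == "muffled"   # final judgment: what is the knocking sound?

print(is_ripe("green", "curly", "muffled"))  # True
```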
Zhi-Hua Zhou
Chapter 5. Neural Networks
Abstract
Research on neural networks started quite a long time ago, and it has become a broad and interdisciplinary research field today. Though neural networks have various definitions across disciplines, this book uses a widely adopted one: “Artificial neural networks are massively parallel interconnected networks of simple (usually adaptive) elements and their hierarchical organizations which are intended to interact with the objects of the real world in the same way as biological nervous systems do” (Kohonen 1988). In the context of machine learning, neural networks refer to “neural networks learning”, or in other words, the intersection of machine learning research and neural networks research.
Zhi-Hua Zhou
Chapter 6. Support Vector Machine
Abstract
Given a training set \(D = \{(\boldsymbol{x}_1, y_1), (\boldsymbol{x}_2, y_2), \ldots , (\boldsymbol{x}_m, y_m)\}\), where \(y_i\in \{-1,+1\}\), the basic idea of classification is to use the training set D to find a separating hyperplane in the sample space that separates samples of different classes. However, there could be multiple qualified separating hyperplanes, as shown in Figure 6.1; which one should be chosen?
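A hyperplane \((\boldsymbol{w}, b)\) qualifies when every sample lies on its correct side; a minimal check of that condition, illustrating why several hyperplanes can all qualify (the data and names are ours):

```python
def separates(w, b, D):
    """True if sign(w . x + b) agrees with the label y for every (x, y) in D."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return all(y * (dot(w, x) + b) > 0 for x, y in D)

# Two different hyperplanes both separate this toy set,
# which is exactly the ambiguity Figure 6.1 depicts.
D = [((1.0, 1.0), +1), ((-1.0, -1.0), -1)]
print(separates((1.0, 1.0), 0.0, D))   # True
print(separates((1.0, 0.5), 0.1, D))   # also True
```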
Zhi-Hua Zhou
Chapter 7. Bayes Classifiers
Abstract
Bayesian decision theory is a fundamental decision-making approach under the probability framework. In the ideal situation where all relevant probabilities are known, Bayesian decision theory makes optimal classification decisions based on these probabilities and the costs of misclassification. In the following, we demonstrate the basic idea of Bayesian decision theory with multiclass classification.
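The resulting decision rule picks the class with the smallest expected misclassification cost given the posterior probabilities; a minimal sketch (the posteriors and costs below are made-up illustrations, not figures from the book):

```python
def bayes_decision(posteriors, cost):
    """Choose the class c minimizing expected risk: sum_j cost[c][j] * P(j | x)."""
    classes = list(posteriors)
    risk = lambda c: sum(cost[c][j] * posteriors[j] for j in classes)
    return min(classes, key=risk)

posteriors = {"ripe": 0.7, "unripe": 0.3}            # P(class | x)
cost = {"ripe":   {"ripe": 0, "unripe": 1},          # cost[decision][true class]
        "unripe": {"ripe": 5, "unripe": 0}}
print(bayes_decision(posteriors, cost))              # "ripe": risk 0.3 vs. 3.5
```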
Zhi-Hua Zhou
Chapter 8. Ensemble Learning
Abstract
Ensemble learning, also known as multiple classifier systems or committee-based learning, trains and combines multiple learners to solve a learning problem.
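The simplest combination strategy is a majority vote over the base learners' predictions; a minimal sketch (entirely illustrative):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the class labels predicted by several learners for one sample."""
    return Counter(predictions).most_common(1)[0][0]

# Three base learners vote on one sample; the ensemble outputs the majority.
print(majority_vote([+1, -1, +1]))  # 1
```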
Zhi-Hua Zhou
Chapter 9. Clustering
Abstract
Unsupervised learning aims to discover underlying properties and patterns from unlabeled training samples and lays the foundation for further data analysis. Among various unsupervised learning techniques, the most researched and applied one is clustering.
Other unsupervised learning tasks include density estimation, anomaly detection, etc.
Zhi-Hua Zhou
Chapter 10. Dimensionality Reduction and Metric Learning
Abstract
k-Nearest Neighbor (kNN) is a commonly used supervised learning method with a simple mechanism: given a testing sample, find the k nearest training samples under some distance metric, and then use these k "neighbors" to make predictions. Typically, for classification problems, voting is used to predict the testing sample as the most frequent class label among the k neighbors; for regression problems, averaging is used to predict the testing sample as the average of the k real-valued outputs. In addition, samples can be weighted by distance so that a closer sample is assigned a higher weight.
The principle of kNN agrees with the proverb that "one takes the behavior of one's company."
See Sect. 8.4.
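A minimal sketch of the classification variant just described, with plain Euclidean distance and majority voting (the data and names are ours):

```python
from collections import Counter

def knn_classify(x, D, k):
    """Predict the majority label among the k training samples nearest to x."""
    dist = lambda u, v: sum((ui - vi) ** 2 for ui, vi in zip(u, v)) ** 0.5
    neighbors = sorted(D, key=lambda s: dist(x, s[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

D = [((0.0, 0.0), "unripe"), ((0.1, 0.2), "unripe"),
     ((1.0, 1.0), "ripe"), ((0.9, 1.1), "ripe")]
print(knn_classify((0.95, 1.0), D, k=3))  # "ripe"
```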
Zhi-Hua Zhou
Chapter 11. Feature Selection and Sparse Learning
Abstract
Watermelons can be described by many attributes, such as color, root, sound, texture, and surface, but experienced people can determine the ripeness using only the root and sound information. In other words, not all attributes are equally important for the learning task. In machine learning, attributes are also called features. Features that are useful for the current learning task are called relevant features, and useless ones are called irrelevant features. The process of selecting relevant features from a given feature set is called feature selection.
Zhi-Hua Zhou
Chapter 12. Computational Learning Theory
Abstract
As the name suggests, computational learning theory is about "learning" by "computation" and is the theoretical foundation of machine learning. It aims to analyze the difficulty of learning problems, provide theoretical guarantees for learning algorithms, and guide algorithm design based on theoretical analysis.
Zhi-Hua Zhou
Chapter 13. Semi-Supervised Learning
Abstract
We come to the watermelon field during the harvest season, and the ground is covered with watermelons. The melon farmer brings over a handful of melons and says that they are all ripe, then points at a few melons on the ground and says that these are not ripe and would take a few more days to ripen. Based on this information, can we build a model to determine which melons in the field are ripe for picking? Certainly, we can use the ripe and unripe watermelons indicated by the farmer as positive and negative samples to train a classifier. However, are a handful of melons enough to serve as training samples? Can we make use of all the watermelons in the field as well?
Zhi-Hua Zhou
Chapter 14. Probabilistic Graphical Models
Abstract
The most important problem in machine learning is to estimate and infer the values of unknown variables (e.g., class labels) based on observed evidence (e.g., training samples). Probabilistic models provide a framework that treats learning problems as computing the probability distributions of variables.
Zhi-Hua Zhou
Chapter 15. Rule Learning
Abstract
In machine learning, rules usually refer to logic rules in the form of "if \(\ldots ,\) then \(\ldots \)" that can describe regular patterns or domain concepts with clear semantics (Fürnkranz et al. 2012). Rule learning is about learning a set of rules from training data for predicting unseen samples.
Broadly speaking, all predictive models can be seen as one rule or a set of rules. In rule learning, we refer to logic rules with the term "logic" omitted.
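One way to picture such "if ..., then ..." rules in code, with each rule as a list of attribute tests plus a conclusion (a toy encoding of ours, not the book's formalism):

```python
# A rule "fires" when all of its attribute tests hold;
# the first firing rule determines the prediction.
rules = [
    ([("root", "curly"), ("sound", "muffled")], "ripe"),
    ([("root", "straight")], "unripe"),
]

def apply_rules(sample, rules, default="unripe"):
    for conditions, conclusion in rules:
        if all(sample.get(attr) == value for attr, value in conditions):
            return conclusion
    return default

print(apply_rules({"root": "curly", "sound": "muffled"}, rules))  # "ripe"
```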
Zhi-Hua Zhou
Chapter 16. Reinforcement Learning
Abstract
Planting watermelon involves many steps, such as seed selection, regular watering, fertilization, weeding, and insect control. We usually do not know the quality of the watermelons until harvesting. If we consider the harvesting of ripe watermelons as a reward for planting watermelons, then we do not receive the final reward immediately after each step of planting, e.g., fertilization. We do not even know the exact impact of the current action on the final reward. Instead, we only receive feedback about the current status, e.g., the watermelon seedling looks healthier.
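The delayed-reward setting sketched here can be pictured with a tiny episode loop in which no reward arrives until harvest (entirely illustrative):

```python
import random

def plant_episode(actions=("select seed", "water", "fertilize", "weed")):
    """Take every planting action first; the reward arrives only at the end."""
    for action in actions:
        # Intermediate feedback is state information, not reward.
        print(f"{action}: seedling state observed, final reward still unknown")
    return 1 if random.random() < 0.5 else 0  # harvest: ripe (1) or not (0)

print("final reward:", plant_episode())
```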
Zhi-Hua Zhou
Backmatter
Metadata
Title
Machine Learning
Author
Zhi-Hua Zhou
Copyright Year
2021
Publisher
Springer Singapore
Electronic ISBN
978-981-15-1967-3
Print ISBN
978-981-15-1966-6
DOI
https://doi.org/10.1007/978-981-15-1967-3
