
2015 | Book

An Introduction to Machine Learning


About this book

This book presents basic ideas of machine learning in a way that is easy to understand, by providing hands-on practical advice, using simple examples, and motivating students with discussions of interesting applications. The main topics include Bayesian classifiers, nearest-neighbor classifiers, linear and polynomial classifiers, decision trees, neural networks, and support vector machines. Later chapters show how to combine these simple tools by way of “boosting,” how to exploit them in more complicated domains, and how to deal with a range of advanced practical issues. One chapter is dedicated to the popular genetic algorithm.

Table of Contents

Frontmatter
Chapter 1. A Simple Machine-Learning Task
Abstract
You will find it difficult to describe your mother’s face accurately enough for your friend to recognize her in a supermarket. But if you show him a few of her photos, he will immediately spot the tell-tale traits he needs. As they say, a picture—an example—is worth a thousand words.
Miroslav Kubat
Chapter 2. Probabilities: Bayesian Classifiers
Abstract
The earliest attempts to predict an example’s class based on the known attribute values go back to well before World War II—prehistory, by the standards of computer science. Of course, nobody used the term “machine learning” in those days, but the goal was essentially the same as the one addressed in this book.
Miroslav Kubat
Chapter 3. Similarities: Nearest-Neighbor Classifiers
Abstract
Two plants that look very much alike probably represent the same species; likewise, it is quite common that patients complaining of similar symptoms suffer from the same disease. In short, similar objects often belong to the same class—an observation that forms the basis of a popular approach to classification: when asked to determine the class of object x, find the training example most similar to it. Then label x with this example’s class.
Miroslav Kubat
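
A minimal sketch of this nearest-neighbor rule in Python (not the book's code; the Euclidean distance measure and the toy data are illustrative assumptions):

    import math

    def nearest_neighbor_class(x, training_set):
        """Label x with the class of the most similar training example.
        training_set is a list of (feature_vector, class_label) pairs;
        Euclidean distance is assumed as the (dis)similarity measure."""
        def distance(a, b):
            return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
        _, label = min(training_set, key=lambda pair: distance(x, pair[0]))
        return label

    # Toy usage: two plants described by (height, leaf_width).
    training = [((1.0, 0.5), "species A"), ((3.0, 2.0), "species B")]
    print(nearest_neighbor_class((1.2, 0.6), training))  # -> species A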
Chapter 4. Inter-Class Boundaries: Linear and Polynomial Classifiers
Abstract
When representing the training examples with points in an n-dimensional instance space, we may realize that positive examples tend to be clustered in regions different from those occupied by negative examples. This observation motivates yet another approach to classification. Instead of the probabilities and similarities employed by the earlier paradigms, we can try to identify the decision surface that separates the two classes. A very simple possibility is to use a linear function to this end. More flexible are high-order polynomials, which can define very complicated inter-class boundaries; these, however, have to be handled with care.
Miroslav Kubat
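
As a quick illustration of a linear decision surface (the weights below are hand-picked for the example, not induced from data), classification reduces to checking on which side of the hyperplane w·x + b = 0 a point falls:

    def linear_classify(x, w, b):
        """Return the class suggested by the sign of the linear function
        w . x + b; points on the boundary are conventionally labeled positive."""
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return "positive" if score >= 0 else "negative"

    # A hand-picked boundary in two dimensions: x1 + x2 - 3 = 0.
    print(linear_classify((2.5, 1.0), w=(1.0, 1.0), b=-3.0))  # -> positive

A polynomial classifier follows the same pattern, only with the inputs first expanded into higher-order terms such as x1*x2 or x1**2, which is where the extra flexibility (and the risk of overfitting) comes from.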
Chapter 5. Artificial Neural Networks
Abstract
Polynomial classifiers can model decision surfaces of any shape; and yet their practical utility is limited because of the ease with which they overfit noisy training data, and because of the sometimes impractically high number of trainable parameters. Much more popular are artificial neural networks, in which many simple units, called neurons, are interconnected by weighted links into larger structures of remarkably high performance.
Miroslav Kubat
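
A single neuron can be sketched in a few lines; the sigmoid activation below is one common choice, assumed here for illustration:

    import math

    def neuron_output(inputs, weights, bias):
        """One artificial neuron: a weighted sum of the inputs squashed
        by the sigmoid function into the interval (0, 1)."""
        activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1.0 / (1.0 + math.exp(-activation))

    # One neuron with two weighted links.
    print(neuron_output(inputs=(0.5, -1.0), weights=(0.8, 0.2), bias=0.1))

A network interconnects many such units, feeding the outputs of one layer as inputs to the next.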
Chapter 6. Decision Trees
Abstract
The classifiers discussed in the previous chapters expect all attribute values to be presented at the same time. Such a scenario, however, has its flaws. Thus a physician seeking to come to grips with the nature of her patient’s condition often has nothing to begin with save a few subjective symptoms. And so, to narrow the field of diagnoses, she prescribes lab tests, and, based on the results, perhaps other tests still. At any given moment, then, the doctor considers only “attributes” that promise to add meaningfully to her current information or understanding. It would be absurd to ask for all possible lab tests (thousands and thousands of them) right from the start.
Miroslav Kubat
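
The physician's sequential questioning is precisely what a decision tree encodes: each internal node asks for one attribute, and the answer decides which attribute to ask for next. A toy sketch (the tests and thresholds are invented for illustration):

    def diagnose(patient):
        """A fictional two-level decision tree: a lab value is consulted
        only when the cheap, subjective symptom makes it relevant."""
        if patient["fever"]:                        # root test
            if patient["white_cell_count"] > 11.0:  # lab test ordered only now
                return "bacterial infection"
            return "viral infection"
        return "no infection suspected"

    print(diagnose({"fever": True, "white_cell_count": 13.2}))  # -> bacterial infection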
Chapter 7. Computational Learning Theory
Abstract
As they say, nothing is more practical than a good theory. And indeed, mathematical models of learnability have helped improve our understanding of what it takes to induce a useful classifier from data, and, conversely, why the outcome of a machine-learning undertaking so often disappoints. And so, even though this textbook does not want to be mathematical, it cannot help introducing at least the basic concepts of computational learning theory.
Miroslav Kubat
Chapter 8. A Few Instructive Applications
Abstract
For someone wishing to become an expert on machine learning, mastering a handful of baseline techniques is not enough. Far from it. The world lurking behind a textbook’s toy domains has a way of complicating things, frustrating the engineer with unexpected obstacles, and challenging everybody’s notion of what exactly the induced classifier is supposed to do and why. Just as in any other field of technology, success is hard to achieve without a healthy dose of creativity.
Miroslav Kubat
Chapter 9. Induction of Voting Assemblies
Abstract
A popular way of dealing with difficult problems is to organize a brainstorming session in which specialists from different fields share their knowledge, offering diverse points of view that complement each other to the point where they may inspire innovative solutions. Something similar can be done in machine learning, too. A group of classifiers is created in a way that makes each of them somewhat different. When they vote about the recommended class, their “collective wisdom” often compensates for each individual’s imperfections.
Miroslav Kubat
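
A minimal sketch of the voting mechanism (the three toy "specialists" below are stand-ins for induced classifiers):

    from collections import Counter

    def vote(classifiers, x):
        """Let every classifier cast a ballot for a class and return
        the most common answer."""
        ballots = [classify(x) for classify in classifiers]
        return Counter(ballots).most_common(1)[0][0]

    # Three deliberately different, individually imperfect voters.
    voters = [lambda text: "spam" if "win" in text else "ham",
              lambda text: "spam" if "$$$" in text else "ham",
              lambda text: "ham"]
    print(vote(voters, "win $$$ now"))  # two of three say spam -> spam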
Chapter 10. Some Practical Aspects to Know About
Abstract
The engineer who wants to avoid disappointments has to be aware of certain machine-learning aspects that, for the sake of clarity, our introduction to the basic techniques had to neglect. To present some of the most important ones is the task for this chapter.
Miroslav Kubat
Chapter 11. Performance Evaluation
Abstract
The previous chapters pretended that performance evaluation in machine learning is a fairly straightforward matter. All it takes is to apply the induced classifier to a set of examples whose classes are known, and then count the number of errors the classifier has made. In reality, things are not so simple. Error rate rarely paints the whole picture, and there are situations in which it can even be misleading. This is why the conscientious engineer wants to be acquainted with other criteria for assessing a classifier’s performance. This knowledge will enable her to choose the criterion that best captures the behavioral aspects of interest.
Miroslav Kubat
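
The naive procedure the chapter begins with fits in a few lines (classifier and test set are placeholders):

    def error_rate(classifier, test_set):
        """Fraction of the test examples, given as (example, true_class)
        pairs, that the classifier mislabels."""
        errors = sum(1 for x, true_class in test_set if classifier(x) != true_class)
        return errors / len(test_set)

As a hint of why this number can mislead: if 95% of the examples belong to one class, a classifier that always answers with that class achieves a 5% error rate while having learned nothing at all.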
Chapter 12. Statistical Significance
Abstract
Suppose you have evaluated a classifier’s performance on an independent testing set. To what extent can you trust your findings? When a flipped coin comes up heads eight times out of ten, any reasonable experimenter will suspect this to be nothing but a fluke, expecting that another set of ten tosses will give a result closer to reality. Similar caution is in order when measuring classification performance. To evaluate classification accuracy on a testing set is not enough; just as important is to develop some notion of the chances that the measured value is a reliable estimate of the classifier’s true behavior.
Miroslav Kubat
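
The coin intuition can be made quantitative. A sketch of the standard normal-approximation confidence interval, applied to the abstract's eight-heads-in-ten-tosses example (the approximation is admittedly crude for so few trials):

    import math

    def confidence_interval(successes, trials, z=1.96):
        """Approximate 95% interval for the true success probability,
        using the normal approximation to the binomial distribution."""
        p = successes / trials
        margin = z * math.sqrt(p * (1 - p) / trials)
        return max(0.0, p - margin), min(1.0, p + margin)

    low, high = confidence_interval(8, 10)
    print(f"observed 0.80, plausibly anywhere in [{low:.2f}, {high:.2f}]")
    # -> roughly [0.55, 1.00]: with only ten tosses the interval covers
    #    almost half the unit range, hardly a precise estimate.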
Chapter 13. The Genetic Algorithm
Abstract
The essence of machine learning is the search for the best solution to our problem: to find a classifier which classifies as correctly as possible not only the training examples, but also future examples. Chapter 1 explained the principle of one of the most popular AI-based search techniques, the so-called hill-climbing, and showed how it can be used in classifier induction.
Miroslav Kubat
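
For reference, the hill-climbing idea itself fits in a short loop; the neighbors and quality functions below are assumed helpers, and the result is only a local optimum:

    def hill_climb(state, neighbors, quality):
        """Greedily move to the best neighbor until no neighbor improves
        on the current state."""
        while True:
            best = max(neighbors(state), key=quality, default=state)
            if quality(best) <= quality(state):
                return state
            state = best

    # Toy usage: climb toward the maximum of f(x) = -(x - 3)^2 over the integers.
    print(hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))  # -> 3

The genetic algorithm, the subject of this chapter, searches the same kind of space but maintains a whole population of candidate solutions instead of a single state.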
Chapter 14. Reinforcement Learning
Abstract
The fundamental problem addressed by this book is how to induce a classifier capable of determining the class of an object. We have seen quite a few techniques that have been developed with this in mind. In reinforcement learning, though, the task is different. Instead of induction from a set of pre-classified examples, the agent “experiments” with a system, and the system responds to this experimentation with rewards or punishments. The agent then optimizes its behavior, its goal being to maximize the rewards and to minimize the punishments.
Miroslav Kubat
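
A minimal sketch of this reward-driven loop, reduced to a one-state "bandit" setting (the environment, the epsilon-greedy policy, and all constants are invented for illustration; the book's treatment is more general):

    import random

    def learn_by_reinforcement(actions, reward, episodes=1000, epsilon=0.1):
        """Estimate each action's value as a running average of the rewards
        the system returns, mostly exploiting the best current estimate
        but occasionally exploring."""
        value = {a: 0.0 for a in actions}
        count = {a: 0 for a in actions}
        for _ in range(episodes):
            if random.random() < epsilon:
                a = random.choice(actions)        # explore
            else:
                a = max(actions, key=value.get)   # exploit
            r = reward(a)                         # the system's response
            count[a] += 1
            value[a] += (r - value[a]) / count[a]
        return max(actions, key=value.get)

    # A fictional two-armed system: "left" is rewarded more often.
    payoff = lambda a: 1.0 if random.random() < (0.7 if a == "left" else 0.3) else 0.0
    print(learn_by_reinforcement(["left", "right"], payoff))  # almost surely -> left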
Backmatter
Metadata
Title
An Introduction to Machine Learning
Author
Miroslav Kubat
Copyright Year
2015
Electronic ISBN
978-3-319-20010-1
Print ISBN
978-3-319-20009-5
DOI
https://doi.org/10.1007/978-3-319-20010-1