2019 | Book

Neural Networks and Statistical Learning

Authors: Dr. Ke-Lin Du, Prof. Dr. M. N. S. Swamy

Publisher: Springer London

About this book

This book provides a broad yet detailed introduction to neural networks and machine learning in a statistical framework. A single, comprehensive resource for study and further research, it explores the major popular neural network models and statistical learning approaches, with examples and exercises that allow readers to gain a practical working understanding of the content. This updated edition presents recently published results and includes six new chapters covering recent advances in computational learning theory, sparse coding, deep learning, big data, and cloud computing.

Each chapter features state-of-the-art descriptions and significant research findings. The topics covered include:

• multilayer perceptron;
• the Hopfield network;
• associative memory models;
• clustering models and algorithms;
• the radial basis function network;
• recurrent neural networks;
• nonnegative matrix factorization;
• independent component analysis;
• probabilistic and Bayesian networks; and
• fuzzy sets and logic.

Focusing on the prominent accomplishments and their practical aspects, this book provides academic and technical staff, as well as graduate students and researchers, with a solid foundation and comprehensive reference in the fields of neural networks, pattern recognition, signal processing, and machine learning.

Table of Contents

Frontmatter
Chapter 1. Introduction

This chapter gives a brief introduction to the history of neural networks and machine learning. The concepts related to neurons, neural networks, and neural network processors are also described. This chapter concludes with an outline of the book.

Ke-Lin Du, M. N. S. Swamy
Chapter 2. Fundamentals of Machine Learning

This chapter deals with the fundamental concepts and theories of machine learning. It first introduces various learning and inference methods, followed by learning and generalization, model selection, and neural networks as universal machines. Some other important topics are also covered.

Ke-Lin Du, M. N. S. Swamy
Chapter 3. Elements of Computational Learning Theory

PAC learning theory is the foundation of computational learning theory. The VC-dimension, Rademacher complexity, and the empirical risk-minimization principle are three concepts used to derive a generalization error bound for a trained machine. The fundamental theorem of learning theory relates PAC learnability, the VC-dimension, and the empirical risk-minimization principle. Another basic theorem in computational learning theory is the no-free-lunch theorem. These topics are addressed in this chapter.
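
To make the notion of a generalization error bound concrete, one standard textbook form of the VC bound (our illustration, not quoted from this book) states that, with probability at least 1 - \delta over m i.i.d. training samples, every hypothesis h from a class of VC-dimension d satisfies

    R(h) \le \widehat{R}_{\mathrm{emp}}(h) + \sqrt{\frac{d\,(\ln(2m/d) + 1) + \ln(4/\delta)}{m}}

where R is the true risk and \widehat{R}_{\mathrm{emp}} the empirical risk; the bound tightens as the sample size m grows and loosens as the capacity d grows.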

Ke-Lin Du, M. N. S. Swamy
Chapter 4. Perceptrons

This chapter introduces the simplest form of neural network, the perceptron. The perceptron occupies a historic position in the discipline of neural networks and machine learning. The one-neuron perceptron and the single-layer perceptron are described, together with various training methods.
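
As a minimal illustration of the classical perceptron learning rule (our Python/NumPy sketch with made-up toy data, not code from the book):

    import numpy as np

    def perceptron_train(X, y, epochs=100, lr=1.0):
        """Rosenblatt's rule for labels y in {-1, +1}: update on mistakes only."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (xi @ w + b) <= 0:      # misclassified sample
                    w += lr * yi * xi           # rotate hyperplane toward it
                    b += lr * yi
        return w, b

    # Linearly separable toy problem (logical AND with +/-1 labels)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([-1., -1., -1., 1.])
    w, b = perceptron_train(X, y)
    print(np.sign(X @ w + b))                    # matches y after convergence

For separable data the perceptron convergence theorem guarantees this loop terminates with a separating hyperplane; for non-separable data it cycles, which motivates variants such as the pocket algorithm.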

Ke-Lin Du, M. N. S. Swamy
Chapter 5. Multilayer Perceptrons: Architecture and Error Backpropagation

The multilayer perceptron is one of the most important neural network models. It is a universal approximator for continuous multivariate functions. This chapter centers on the multilayer perceptron model and the backpropagation learning algorithm. Some related topics, such as network architecture optimization, learning speedup strategies, and first-order gradient-based learning algorithms, are also introduced.
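
A compact sketch of error backpropagation for a one-hidden-layer sigmoid network on XOR (our illustrative code; the layer sizes, learning rate, and seed are arbitrary choices, not the book's):

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    # XOR problem: 2 inputs, 4 hidden units, 1 output
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([[0], [1], [1], [0]], dtype=float)
    W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
    W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)
    lr = 0.5

    for _ in range(20000):
        # forward pass
        h = sigmoid(X @ W1 + b1)
        yhat = sigmoid(h @ W2 + b2)
        # backward pass: chain rule for the squared-error loss
        d2 = (yhat - t) * yhat * (1 - yhat)
        d1 = (d2 @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d2; b2 -= lr * d2.sum(0)
        W1 -= lr * X.T @ d1; b1 -= lr * d1.sum(0)

    print(np.round(yhat.ravel(), 2))   # typically approaches [0, 1, 1, 0]

The two delta terms are exactly the backpropagated error signals: the output delta is scaled by the sigmoid derivative, and the hidden delta is obtained by propagating it backward through W2.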

Ke-Lin Du, M. N. S. Swamy
Chapter 6. Multilayer Perceptrons: Other Learning Techniques

This chapter continues to deal with the multilayer perceptron, focusing on various second-order learning methods that speed up the learning process. Complex-valued multilayer perceptrons and spiking neural networks are also introduced in this chapter.

Ke-Lin Du, M. N. S. Swamy
Chapter 7. Hopfield Networks, Simulated Annealing, and Chaotic Neural Networks

The Hopfield model is the most popular dynamic neural network model. Simulated annealing, inspired by annealing in metallurgy, is a metaheuristic for approximating the global optimum in a large search space; the annealing concept is widely used in the training of recurrent neural networks. Chaotic neural networks are recurrent neural networks endowed with chaotic dynamics. The cellular neural network generalizes the Hopfield network to a two- or higher dimensional array of cells. This chapter is dedicated to these models, which are widely used for solving combinatorial optimization problems.
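
A minimal simulated-annealing sketch (our Python illustration; the objective, neighborhood, and cooling schedule are arbitrary assumptions):

    import numpy as np

    def simulated_annealing(f, x0, T0=1.0, alpha=0.995, steps=5000, rng=None):
        """Accept uphill moves with probability exp(-dE/T); T cools geometrically."""
        rng = rng or np.random.default_rng(0)
        x = np.asarray(x0, dtype=float)
        fx, T = f(x), T0
        for _ in range(steps):
            cand = x + rng.normal(0.0, 0.5, size=x.shape)   # random neighbor
            d = f(cand) - fx
            if d < 0 or rng.random() < np.exp(-d / T):      # Metropolis criterion
                x, fx = cand, fx + d
            T *= alpha                                       # cool down
        return x, fx

    # A multimodal (Rastrigin-type) objective with global minimum at the origin
    f = lambda v: float(np.sum(v**2 - 10.0 * np.cos(2 * np.pi * v) + 10.0))
    x, fx = simulated_annealing(f, [3.0, -2.5])
    print(x, fx)   # usually lands near the origin despite many local minima

Early on, the high temperature lets the search escape local minima; as T decreases, the process settles into a (near-)global optimum.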

Ke-Lin Du, M. N. S. Swamy
Chapter 8. Associative Memory Networks

In the brain, knowledge is learnt by associating different types of sensory data. Associative memory is a fundamental function of the human brain. It can be realized with neural networks that have backward connections. Neural networks for associative memory and their learning algorithms are introduced in this chapter.

Ke-Lin Du, M. N. S. Swamy
Chapter 9. Clustering I: Basic Clustering Models and Algorithms

Clustering is an unsupervised classification technique that identifies some inherent structure present in a set of objects based on a similarity measure. Clustering methods can be derived from statistical models or from competitive learning; correspondingly, they can be classified into generative (or model-based) and discriminative (or similarity-based) approaches. A clustering problem can also be modeled as a combinatorial optimization problem. In model-based clustering, a probability density function (pdf) for the data is estimated by learning its parameters. In this chapter, our emphasis is placed on a number of competitive-learning-based neural networks and clustering algorithms. We describe the SOM, learning vector quantization (LVQ), and ART models, as well as C-means, subtractive, and fuzzy clustering algorithms.
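
As a concrete example, the C-means (k-means) algorithm mentioned above alternates an assignment step and a centroid-update step; a minimal sketch with synthetic data (ours, not the book's):

    import numpy as np

    def c_means(X, k, iters=100, rng=None):
        """Lloyd's algorithm; assumes no cluster goes empty on this toy data."""
        rng = rng or np.random.default_rng(0)
        centers = X[rng.choice(len(X), k, replace=False)]   # init from data
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)                       # assignment step
            centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        return labels, centers

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
    labels, centers = c_means(X, 2)
    print(centers)   # two centroids near (0, 0) and (3, 3)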

Ke-Lin Du, M. N. S. Swamy
Chapter 10. Clustering II: Topics in Clustering

Following Chap. 9, this chapter continues to deal with clustering. We describe many associated topics, such as the underutilization problem, robust clustering, hierarchical clustering, and cluster validity. Kernel-based clustering is introduced in Chap. 20.

Ke-Lin Du, M. N. S. Swamy
Chapter 11. Radial Basis Function Networks

Next to the multilayer perceptron, the RBF network is another popular feedforward neural network. This chapter is dedicated to the RBF network and its learning. A comparison with the multilayer perceptron is also given.
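
A minimal RBF-network sketch (our illustration): Gaussian hidden units with fixed centers, and output weights fitted by linear least squares, one standard training strategy for this model; centers, width, and data are arbitrary toy choices:

    import numpy as np

    def rbf_design(X, centers, sigma):
        """Gaussian hidden-layer activations for inputs X."""
        return np.exp(-np.linalg.norm(X[:, None] - centers[None, :], axis=2)**2
                      / (2 * sigma**2))

    def rbf_fit(X, y, centers, sigma):
        """Solve for output weights by linear least squares."""
        w, *_ = np.linalg.lstsq(rbf_design(X, centers, sigma), y, rcond=None)
        return w

    # 1-D regression toy: approximate sin(x)
    X = np.linspace(0, 2 * np.pi, 40)[:, None]
    y = np.sin(X).ravel()
    centers = X[::5]                                 # every 5th sample as a center
    w = rbf_fit(X, y, centers, sigma=0.8)
    yhat = rbf_design(X, centers, sigma=0.8) @ w
    print(np.abs(yhat - y).max())                    # small residual error

Because the hidden layer is fixed, training reduces to a linear problem, which is the main speed advantage over the multilayer perceptron.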

Ke-Lin Du, M. N. S. Swamy
Chapter 12. Recurrent Neural Networks

Recurrent networks are neural networks with backward connections. They are dynamical systems with temporal state representations. They are used in many temporal processing models and applications. This chapter deals with recurrent networks and their learning.

Ke-Lin Du, M. N. S. Swamy
Chapter 13. Principal Component Analysis

Subspace learning techniques project high-dimensional data onto low-dimensional spaces. They are typically unsupervised. Well-known subspace learning algorithms include PCA, ICA, locality-preserving projection, and NMF. Discriminant analysis is a supervised subspace learning method that uses the class label information of the data. PCA is a classical statistical method for signal processing and data analysis. In the neural network setting it acts as a feature extractor, and it is closely related to eigenvalue decomposition and singular value decomposition. This chapter introduces PCA and associated methods such as minor component analysis, generalized eigenvalue decomposition, singular value decomposition, factor analysis, and canonical correlation analysis.
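
Reflecting the link to singular value decomposition, a minimal PCA sketch (our illustration with synthetic data):

    import numpy as np

    def pca(X, k):
        """Project X onto its k leading principal components via SVD."""
        Xc = X - X.mean(axis=0)                      # center the data
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        components = Vt[:k]                          # top-k right singular vectors
        variances = s[:k]**2 / (len(X) - 1)          # explained variances
        return Xc @ components.T, components, variances

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2)) @ np.array([[3., 0.], [1., 0.3]])
    Z, comps, var = pca(X, 1)
    print(comps, var)   # dominant direction and the variance it captures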

Ke-Lin Du, M. N. S. Swamy
Chapter 14. Nonnegative Matrix Factorization

Low-rank matrix factorization, or factor analysis, is an important task in the analysis of high-dimensional real-world data, with uses in dimension reduction, data compression, feature extraction, and information retrieval. Nonnegative matrix factorization is a special low-rank factorization technique for nonnegative data. This chapter is dedicated to nonnegative matrix factorization. Other matrix decomposition methods, such as the Nyström method and CUR matrix decomposition, are also introduced in this chapter.
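
A minimal sketch of the classical Lee-Seung multiplicative updates for NMF under the Frobenius-norm objective (our illustration; the rank and iteration count are arbitrary):

    import numpy as np

    def nmf(V, r, iters=500, rng=None):
        """Multiplicative updates for V ~ W @ H with all entries >= 0."""
        rng = rng or np.random.default_rng(0)
        n, m = V.shape
        W = rng.random((n, r)) + 0.1
        H = rng.random((r, m)) + 0.1
        eps = 1e-9                                   # avoid division by zero
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + eps)     # nonnegativity preserved
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    V = np.random.default_rng(1).random((20, 15))
    W, H = nmf(V, r=4)
    print(np.linalg.norm(V - W @ H))   # reconstruction error after the updates

Because the updates are multiplicative, W and H stay elementwise nonnegative without any explicit projection step.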

Ke-Lin Du, M. N. S. Swamy
Chapter 15. Independent Component Analysis

Blind source separation is a basic topic in signal and image processing. Independent component analysis is a basic solution to blind source separation. This chapter introduces blind source separation, with emphasis on independent component analysis. Some methods related to source separation for time series are also mentioned.
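
As an illustration, a one-unit FastICA iteration with the tanh nonlinearity on whitened mixtures (our sketch; the mixing matrix and sources are made-up toy choices):

    import numpy as np

    def fastica_one_unit(Xw, iters=200, rng=None):
        """One-unit FastICA (tanh contrast); Xw is whitened, shape dims x samples."""
        rng = rng or np.random.default_rng(0)
        w = rng.normal(size=Xw.shape[0]); w /= np.linalg.norm(w)
        for _ in range(iters):
            wx = w @ Xw
            g, gp = np.tanh(wx), 1.0 - np.tanh(wx)**2
            w = (Xw * g).mean(axis=1) - gp.mean() * w   # fixed-point update
            w /= np.linalg.norm(w)
        return w

    rng = np.random.default_rng(1)
    n = 5000
    S = np.vstack([np.sign(rng.normal(size=n)),         # two independent,
                   rng.uniform(-1.7, 1.7, n)])          # non-Gaussian sources
    X = np.array([[2., 1.], [1., 1.]]) @ S              # linear mixtures
    X -= X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(X))
    Xw = np.diag(d ** -0.5) @ E.T @ X                   # whitening
    y = fastica_one_unit(Xw) @ Xw
    print(np.round([np.corrcoef(y, S[0])[0, 1],
                    np.corrcoef(y, S[1])[0, 1]], 2))    # one entry near +/-1

One correlation should be near +/-1: a single source is recovered, up to the sign and scale ambiguity inherent to blind separation.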

Ke-Lin Du, M. N. S. Swamy
Chapter 16. Discriminant Analysis

Discriminant analysis plays an important role in statistical pattern recognition. LDA, originally derived by Fisher, is one of the most popular discriminant analysis techniques. Under the assumption that the class distributions are Gaussian with a common covariance matrix, LDA is Bayes optimal. Like PCA, LDA is widely applied to image retrieval, face recognition, information retrieval, and pattern recognition. This chapter is dedicated to discriminant analysis.
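
A minimal two-class Fisher LDA sketch (our illustration): the discriminant direction is w ~ Sw^-1 (m1 - m0), computed here on synthetic Gaussian classes:

    import numpy as np

    def fisher_lda(X0, X1):
        """Fisher discriminant direction for two classes."""
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        # within-class scatter = pooled (unnormalized) covariances
        Sw = np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)
        w = np.linalg.solve(Sw, m1 - m0)
        return w / np.linalg.norm(w)

    rng = np.random.default_rng(0)
    X0 = rng.normal([0., 0.], 1.0, size=(100, 2))   # class 0
    X1 = rng.normal([3., 1.], 1.0, size=(100, 2))   # class 1
    w = fisher_lda(X0, X1)
    print(w, (X1 @ w).mean() - (X0 @ w).mean())     # projections well separated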

Ke-Lin Du, M. N. S. Swamy
Chapter 17. Reinforcement Learning

One of the primary goals of AI is to produce fully autonomous agents that learn optimal behaviors through trial and error by interacting with their environments. The reinforcement learning paradigm is essentially learning through interaction, and it has its roots in behaviorist psychology. Reinforcement learning is influenced by optimal control, which is underpinned by the mathematical formalism of dynamic programming. This chapter deals with reinforcement learning.
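
As a small concrete instance of learning through interaction, tabular Q-learning on a toy chain environment (our illustration; the environment, rewards, and hyperparameters are made up):

    import numpy as np

    # Chain of 5 states; reward 1 for reaching the rightmost (terminal) state
    n_states, n_actions = 5, 2                     # actions: 0 = left, 1 = right
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.2
    rng = np.random.default_rng(0)

    for _ in range(500):                            # episodes of trial and error
        s = 0
        while s != n_states - 1:
            if rng.random() < eps or Q[s, 0] == Q[s, 1]:
                a = int(rng.integers(n_actions))    # explore / break ties
            else:
                a = int(Q[s].argmax())              # exploit current estimate
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # TD update
            s = s2

    print(Q.argmax(axis=1))   # greedy policy: move right in states 0-3
                              # (the terminal state is never updated)

The temporal-difference update is the dynamic-programming Bellman backup applied to sampled transitions rather than a known model.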

Ke-Lin Du, M. N. S. Swamy
Chapter 18. Compressed Sensing and Dictionary Learning

Sparse coding is a matrix factorization technique. It models a target signal as a sparse linear combination of atoms (elementary signals) drawn from a dictionary (a fixed collection). Sparse coding has become a popular paradigm in signal processing, statistics, and machine learning. This chapter introduces compressed sensing, sparse representation/sparse coding, tensor compressed sensing, and sparse PCA.
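
A greedy sparse-coding sketch, orthogonal matching pursuit (our illustration; the dictionary size and sparsity level are arbitrary):

    import numpy as np

    def omp(D, y, k):
        """Greedy k-sparse coding of y over dictionary D (columns = atoms)."""
        residual, support = y.copy(), []
        for _ in range(k):
            j = int(np.abs(D.T @ residual).argmax())   # most correlated atom
            support.append(j)
            # re-fit coefficients on the current support (orthogonal projection)
            coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ coef
        x = np.zeros(D.shape[1])
        x[support] = coef
        return x

    rng = np.random.default_rng(0)
    D = rng.normal(size=(30, 100))
    D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
    x_true = np.zeros(100); x_true[[5, 42, 77]] = [1.0, -2.0, 0.5]
    y = D @ x_true
    print(np.nonzero(omp(D, y, 3))[0])              # typically recovers {5, 42, 77}

With a well-conditioned random dictionary and a sufficiently sparse signal, the greedy pursuit typically recovers the true support exactly.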

Ke-Lin Du, M. N. S. Swamy
Chapter 19. Matrix Completion

The recovery of a data matrix from a subset of its entries is an extension of compressed sensing and sparse approximation. This chapter introduces matrix completion and matrix recovery. The ideas are also extended to tensor factorization and completion.

Ke-Lin Du, M. N. S. Swamy
Chapter 20. Kernel Methods

This chapter introduces the basics of the kernel method. Extensions of the kernel method to some traditional methods are also described. The SVM method will be described in the next chapter.
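
A minimal example of the kernel trick, kernel ridge regression with a Gaussian kernel (our sketch; the bandwidth and regularizer are arbitrary choices):

    import numpy as np

    def kernel_ridge(X, y, Xq, gamma=1.0, lam=1e-3):
        """Fit in the kernel-induced feature space without computing it."""
        k = lambda A, B: np.exp(-gamma * np.linalg.norm(
            A[:, None] - B[None, :], axis=2)**2)     # Gaussian kernel matrix
        alpha = np.linalg.solve(k(X, X) + lam * np.eye(len(X)), y)
        return k(Xq, X) @ alpha                      # predict at query points

    X = np.linspace(0, 2 * np.pi, 30)[:, None]
    y = np.sin(X).ravel()
    Xq = np.array([[1.0], [4.0]])
    print(kernel_ridge(X, y, Xq))   # close to sin(1) ~ 0.84 and sin(4) ~ -0.76

All computation goes through inner products (kernel evaluations), which is exactly what allows traditional linear methods to be kernelized.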

Ke-Lin Du, M. N. S. Swamy
Chapter 21. Support Vector Machines

SVM is one of the most popular nonparametric classification algorithms. Grounded in computational learning theory, it is optimal in the sense of maximizing the margin of separation. This chapter is dedicated to SVM. We first introduce the SVM model. Training methods for classification, clustering, and regression using SVM are then introduced in detail. Associated topics such as model architecture optimization are also described.
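
A minimal linear soft-margin SVM trained by Pegasos-style stochastic subgradient descent on the hinge loss (our illustrative sketch, not one of the solvers covered in the chapter):

    import numpy as np

    def linear_svm(X, y, lam=0.01, epochs=200, rng=None):
        """Stochastic subgradient descent on hinge loss; y in {-1, +1}."""
        rng = rng or np.random.default_rng(0)
        w, b, t = np.zeros(X.shape[1]), 0.0, 0
        for _ in range(epochs):
            for i in rng.permutation(len(X)):
                t += 1
                lr = 1.0 / (lam * t)                 # decaying step size
                margin = y[i] * (X[i] @ w + b)
                w *= (1 - lr * lam)                  # regularization shrinkage
                if margin < 1:                       # point violates the margin
                    w += lr * y[i] * X[i]
                    b += lr * y[i]
        return w, b

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.r_[-np.ones(50), np.ones(50)]
    w, b = linear_svm(X, y)
    print(np.mean(np.sign(X @ w + b) == y))   # training accuracy near 1.0

Only points inside the margin trigger an update, which is the stochastic counterpart of the fact that the SVM solution depends only on the support vectors.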

Ke-Lin Du, M. N. S. Swamy
Chapter 22. Probabilistic and Bayesian Networks

This chapter introduces several important probabilistic models. The Bayesian network is a well-known probabilistic model in machine learning. The hidden Markov model is a special case of the Bayesian network for dynamic systems. Important probabilistic methods, including sampling methods, the expectation-maximization method, the variational Bayesian method, and mixture models, are introduced. Some Bayesian and probabilistic approaches to machine learning are also mentioned in this chapter.

Ke-Lin Du, M. N. S. Swamy
Chapter 23. Boltzmann Machines

Since its invention in 1985, the Boltzmann machine had long been treated as a model of mere historic significance by the machine learning community. In 2006, the model began to regain popularity when Hinton and collaborators achieved a breakthrough in deep learning, in which the restricted Boltzmann machine is the prime component of the deep neural network. In this chapter, we introduce the Boltzmann machine and its reduced form, known as the restricted Boltzmann machine, as well as their learning algorithms. Related topics are also treated.
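
A minimal binary RBM with one-step contrastive-divergence (CD-1) training (our sketch; the toy patterns and hyperparameters are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def cd1_step(v0, W, a, b, lr=0.1):
        """One CD-1 update: data-driven statistics minus one-step reconstruction."""
        ph0 = sigmoid(v0 @ W + b)                    # hidden probabilities
        h0 = (rng.random(ph0.shape) < ph0) * 1.0     # sample hidden states
        pv1 = sigmoid(h0 @ W.T + a)                  # reconstruct visibles
        ph1 = sigmoid(pv1 @ W + b)
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        a += lr * (v0 - pv1).mean(axis=0)            # visible biases
        b += lr * (ph0 - ph1).mean(axis=0)           # hidden biases

    # Toy data: 6 visible units, patterns drawn from two templates
    data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 20, dtype=float)
    W = rng.normal(0, 0.1, (6, 2)); a = np.zeros(6); b = np.zeros(2)
    for _ in range(2000):
        cd1_step(data, W, a, b)
    print(np.round(sigmoid(data[:2] @ W + b), 2))    # two distinct hidden codes

CD-1 replaces the intractable model expectation in the Boltzmann-machine gradient with statistics from a single Gibbs step, which is what made RBM training practical.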

Ke-Lin Du, M. N. S. Swamy
Chapter 24. Deep Learning

The advent of deep learning has dramatically improved the state of the art in artificial intelligence (AI). Deep learning is regarded as the AI model closest to the human brain due to its deep structure. It has been widely applied to pattern understanding and recognition tasks that are traditionally hard to solve. This chapter introduces deep learning and deep learning networks.

Ke-Lin Du, M. N. S. Swamy
Chapter 25. Combining Multiple Learners: Data Fusion and Ensemble Learning

According to the no-free-lunch theorem, there is no single method that always performs best in every domain. In practice, many methods are available for solving a given problem, each having its limitations. A popular way of dealing with difficult problems is brainstorming, in which participants share their knowledge from different viewpoints and collective wisdom is achieved by voting on the decision. Analogously, data fusion combines the results of individual methods using ensemble learning. This chapter deals with ensemble learning.
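
The simplest ensemble combiner is plurality voting over the learners' class predictions; a tiny sketch (ours, with made-up predictions):

    import numpy as np

    def majority_vote(predictions):
        """Combine class labels from several learners by plurality vote."""
        predictions = np.asarray(predictions)   # shape: (n_learners, n_samples)
        return np.array([np.bincount(col).argmax() for col in predictions.T])

    # Three weak learners, each wrong on a different sample
    p1 = np.array([0, 1, 1, 0])
    p2 = np.array([0, 0, 1, 1])
    p3 = np.array([1, 0, 1, 0])
    print(majority_vote([p1, p2, p3]))   # -> [0 0 1 0], individual errors voted away

As long as the learners' errors are not strongly correlated, the vote can outperform every individual member, which is the core intuition behind ensemble learning.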

Ke-Lin Du, M. N. S. Swamy
Chapter 26. Introduction to Fuzzy Sets and Logic

In many soft sciences (e.g., psychology, sociology, ethology), scientists provide verbal descriptions and explanations of various phenomena based on observations. Fuzzy logic provides the most suitable tool for verbal computation. It is a paradigm for modeling the uncertainty in human reasoning, and is a basic tool for machine learning and expert systems. This chapter introduces fuzzy sets and logic. Some associated topics on reasoning and granular computing are also described.

Ke-Lin Du, M. N. S. Swamy
Chapter 27. Neurofuzzy Systems

Hybridization of fuzzy logic and neural networks yields neurofuzzy systems, which capture the merits of both paradigms. This chapter first describes how to extract rules from neural networks and data, and then introduces how the synergy of fuzzy logic and neural network paradigms is implemented.

Ke-Lin Du, M. N. S. Swamy
Chapter 28. Neural Network Circuits and Parallel Implementations

Hardware and parallel implementations can substantially speed up machine learning algorithms, extending their range of applications. In this chapter, we first introduce various circuit realizations of popular neural network learning methods. We then introduce their parallel implementations on graphics processing units (GPUs), systolic arrays of processors, and parallel computers.

Ke-Lin Du, M. N. S. Swamy
Chapter 29. Pattern Recognition for Biometrics and Bioinformatics

Biometrics and bioinformatics are among the most important and successful applications of machine learning methods. Biometrics are physical characteristics of a person, usually used for identification. Bioinformatics is concerned with extracting information from DNA or protein sequences. This chapter gives an introductory account of these two topics.

Ke-Lin Du, M. N. S. Swamy
Chapter 30. Data Mining

The wealth of information in huge databases and on the Web has aroused tremendous interest in data mining, also known as knowledge discovery in databases. This chapter introduces data mining. We first introduce the neural network approach to data mining, and then address various data mining and information retrieval problems on the Web.

Ke-Lin Du, M. N. S. Swamy
Chapter 31. Big Data, Cloud Computing, and Internet of Things

The era of big data has arrived, and big data and cloud computing go hand in hand. The Internet of Things (IoT) has resulted in a hyper-world consisting of the social, cyber, and physical worlds, with data as a bridge. These topics are closely related to data science and are introduced in this chapter.

Ke-Lin Du, M. N. S. Swamy
Backmatter
Metadata
Title
Neural Networks and Statistical Learning
Authors
Dr. Ke-Lin Du
Prof. Dr. M. N. S. Swamy
Copyright Year
2019
Publisher
Springer London
Electronic ISBN
978-1-4471-7452-3
Print ISBN
978-1-4471-7451-6
DOI
https://doi.org/10.1007/978-1-4471-7452-3
