2014 | Book

Neural Networks and Statistical Learning

About this Book

Providing a broad yet in-depth introduction to neural networks and machine learning in a statistical framework, this book serves as a single, comprehensive resource for study and further research. All the major popular neural network models and statistical learning approaches are covered, with examples and exercises in every chapter to develop a practical working understanding of the content.

Each of the twenty-five chapters includes state-of-the-art descriptions and important research results on the respective topics. The broad coverage includes the multilayer perceptron, the Hopfield network, associative memory models, clustering models and algorithms, the radial basis function network, recurrent neural networks, principal component analysis, nonnegative matrix factorization, independent component analysis, discriminant analysis, support vector machines, kernel methods, reinforcement learning, probabilistic and Bayesian networks, data fusion and ensemble learning, fuzzy sets and logic, neurofuzzy models, hardware implementations, and some machine learning topics. Applications to biometric/bioinformatics and data mining are also included.

Focusing on prominent accomplishments and their practical aspects, this book gives academic and technical staff, graduate students, and researchers a solid foundation and an encompassing reference for the fields of neural networks, pattern recognition, signal processing, machine learning, computational intelligence, and data mining.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
The discipline of neural networks models the human brain. The average human brain consists of nearly 10^11 neurons of various types, with each neuron connecting to up to tens of thousands of synapses.
Ke-Lin Du, M. N. S. Swamy
Chapter 2. Fundamentals of Machine Learning
Abstract
Learning is a fundamental capability of neural networks. Learning rules are algorithms for finding suitable weights W and/or other network parameters.
Ke-Lin Du, M. N. S. Swamy
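As a minimal illustration of what a learning rule does (not a specific algorithm from this chapter), the sketch below uses the classical delta rule to adapt a weight vector toward a target mapping; the data, dimensions, and learning rate are hypothetical.

```python
import numpy as np

# Minimal delta-rule sketch: adjust weights w so that x @ w approximates t.
# Data and learning rate are hypothetical, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 input patterns, 3 features
w_true = np.array([1.5, -2.0, 0.5])
t = X @ w_true                          # target outputs

w = np.zeros(3)                         # initial weights
eta = 0.05                              # learning rate
for epoch in range(50):
    for x_i, t_i in zip(X, t):
        y_i = x_i @ w                   # network output
        w += eta * (t_i - y_i) * x_i    # delta rule: error times input

print(w)                                # approaches w_true
```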
Chapter 3. Perceptrons
Abstract
The perceptron [38], also referred to as a McCulloch-Pitts neuron or linear threshold gate, is the earliest and simplest neural network model. Rosenblatt used a single-layer perceptron for the classification of linearly separable patterns.
Ke-Lin Du, M. N. S. Swamy
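A minimal sketch of the perceptron learning rule on a hypothetical linearly separable problem; the data generation and number of epochs are illustrative assumptions, not taken from the book.

```python
import numpy as np

# Rosenblatt-style perceptron sketch on a hypothetical linearly separable
# two-class problem (labels +1/-1); weights are updated only on mistakes.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
scores = X[:, 0] + 2 * X[:, 1]
X = X[np.abs(scores) > 0.5][:200]                 # keep a clear separation margin
y = np.where(X[:, 0] + 2 * X[:, 1] > 0, 1, -1)

w, b = np.zeros(2), 0.0
for epoch in range(50):
    for x_i, y_i in zip(X, y):
        if y_i * (x_i @ w + b) <= 0:              # misclassified: apply the update rule
            w += y_i * x_i
            b += y_i

print(np.sum(np.sign(X @ w + b) != y))            # 0 once training has converged
```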
Chapter 4. Multilayer Perceptrons: Architecture and Error Backpropagation
Abstract
MLPs are feedforward networks with one or more layers of units between the input and output layers. The output units represent a hyperplane in the space of the input patterns. The architecture of the MLP is illustrated in Fig. 4.1.
Ke-Lin Du, M. N. S. Swamy
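The following sketch shows a one-hidden-layer MLP forward pass and a single error-backpropagation update for a squared-error loss; the layer sizes, weights, and learning rate are hypothetical.

```python
import numpy as np

# One-hidden-layer MLP: forward pass plus one backpropagation step.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(2)
W1 = rng.normal(scale=0.5, size=(4, 3))   # input (3) -> hidden (4) weights
b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(1, 4))   # hidden (4) -> output (1) weights
b2 = np.zeros(1)

x = np.array([0.2, -1.0, 0.5])            # one input pattern
t = np.array([1.0])                       # its target

h = sigmoid(W1 @ x + b1)                  # hidden-layer activations
y = W2 @ h + b2                           # network output (linear output unit)

e = y - t                                 # output error for squared-error loss
dW2 = np.outer(e, h)                      # gradient w.r.t. W2
delta_h = (W2.T @ e) * h * (1 - h)        # error backpropagated through sigmoid
dW1 = np.outer(delta_h, x)                # gradient w.r.t. W1

eta = 0.1                                 # one gradient-descent update
W2 -= eta * dW2; b2 -= eta * e
W1 -= eta * dW1; b1 -= eta * delta_h
print(y)
```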
Chapter 5. Multilayer Perceptrons: Other Learning Techniques
Abstract
Training of feedforward networks can be viewed as an unconstrained optimization problem. BP is slow to converge when the error surface is flat along a weight dimension. Second-order optimization techniques have a strong theoretical basis and provide significantly faster convergence.
Ke-Lin Du, M. N. S. Swamy
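As a toy illustration of why second-order information helps on flat error surfaces (not one of the chapter's specific algorithms), the sketch below compares plain gradient descent with a single Newton step on an ill-conditioned quadratic; all values are hypothetical.

```python
import numpy as np

# Toy comparison: gradient descent vs. a Newton step on a quadratic loss
# E(w) = 0.5 * w^T A w - b^T w, with an ill-conditioned A (flat along one axis).
A = np.array([[100.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])
w_star = np.linalg.solve(A, b)              # exact minimizer

w_gd = np.zeros(2)
for _ in range(50):
    grad = A @ w_gd - b
    w_gd -= 0.01 * grad                     # first-order step: slow along the flat direction

w_newton = np.zeros(2)
grad = A @ w_newton - b
w_newton -= np.linalg.solve(A, grad)        # Newton step: exact for a quadratic

print(np.linalg.norm(w_gd - w_star), np.linalg.norm(w_newton - w_star))
```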
Chapter 6. Hopfield Networks, Simulated Annealing, and Chaotic Neural Networks
Abstract
The Hopfield model [27, 28] is the most popular dynamic model. It is biologically plausible since it functions like the human retina [36].
Ke-Lin Du, M. N. S. Swamy
Chapter 7. Associative Memory Networks
Abstract
The human brain stores information in synapses or in reverberating loops of electrical activity. Most existing associative memory models store information in synapses.
Ke-Lin Du, M. N. S. Swamy
Chapter 8. Clustering I: Basic Clustering Models and Algorithms
Abstract
Clustering is a fundamental tool for data analysis. It finds wide applications in many engineering and scientific fields including pattern recognition, feature extraction, vector quantization, image segmentation, bioinformatics, and data mining. Clustering is a classical method for the prototype selection of kernel-based neural networks such as the RBF network, and is most useful for neurofuzzy systems.
Ke-Lin Du, M. N. S. Swamy
Chapter 9. Clustering II: Topics in Clustering
Abstract
Conventional competitive learning-based clustering algorithms like C-means and LVQ are plagued by a severe initialization problem [57, 106]. If the initial values of the prototypes are not in the convex hull formed by the input data, clustering may not produce meaningful results.
Ke-Lin Du, M. N. S. Swamy
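A common remedy for the initialization problem is to seed the prototypes with randomly chosen data points, so that they start inside the convex hull of the inputs. The following C-means (k-means) sketch does exactly that on hypothetical data; it is an illustration, not an algorithm reproduced from the book.

```python
import numpy as np

# C-means (k-means) sketch with prototypes seeded from the data itself,
# so they start inside the convex hull of the inputs (hypothetical data).
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2))
               for c in ([0, 0], [3, 0], [0, 3])])

C = 3
prototypes = X[rng.choice(len(X), size=C, replace=False)]   # seed from data points

for _ in range(20):
    # assign each point to its nearest prototype
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # move each prototype to the mean of its assigned points
    prototypes = np.array([X[labels == k].mean(axis=0) for k in range(C)])

print(prototypes)
```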
Chapter 10. Radial Basis Function Networks
Abstract
Learning is an approximation problem, closely related to conventional approximation techniques such as generalized splines and regularization. The RBF network has its origin in performing exact interpolation of a set of data points in a multidimensional space [81]. The RBF network is a universal approximator, and it is a popular alternative to the MLP, since it has a simpler structure and a much faster training process. Both models are widely used for classification and function approximation.
Ke-Lin Du, M. N. S. Swamy
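The exact-interpolation view can be made concrete in a few lines: place one Gaussian basis function at every data point and solve a linear system for the weights. The sketch below uses hypothetical data and an assumed RBF width.

```python
import numpy as np

# Exact RBF interpolation sketch: one Gaussian basis centered at every
# data point, with weights found by solving a linear system.
rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 2 * np.pi, size=20))
t = np.sin(x)                                   # targets to interpolate
sigma = 0.5                                     # assumed RBF width

Phi = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))  # 20x20 design matrix
w = np.linalg.solve(Phi, t)                     # weights for exact interpolation

x_new = np.linspace(0, 2 * np.pi, 5)
Phi_new = np.exp(-(x_new[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))
print(Phi_new @ w)                              # approximates sin(x_new)
```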
Chapter 11. Recurrent Neural Networks
Abstract
The brain is a strongly recurrent structure. This massive recurrence suggests a major role of self-feeding dynamics in the processes of perceiving, acting, and learning, and in keeping the organism alive.
Ke-Lin Du, M. N. S. Swamy
Chapter 12. Principal Component Analysis
Abstract
Most signal-processing problems can be reduced to some form of eigenvalue or singular-value problems.
Ke-Lin Du, M. N. S. Swamy
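As a small illustration of this eigenvalue view (a generic PCA computation, not code from the book), the sketch below obtains the principal components as eigenvectors of the sample covariance matrix of hypothetical data.

```python
import numpy as np

# PCA sketch: principal components as eigenvectors of the sample covariance matrix.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3)) @ np.diag([2.0, 1.0, 0.2])   # anisotropic toy data

Xc = X - X.mean(axis=0)                     # center the data
C = Xc.T @ Xc / (len(Xc) - 1)               # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns ascending eigenvalues
order = eigvals.argsort()[::-1]             # sort descending by explained variance
components = eigvecs[:, order]

Z = Xc @ components[:, :2]                  # project onto the top 2 components
print(eigvals[order])                       # variances along the principal axes
```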
Chapter 13. Nonnegative Matrix Factorization
Abstract
Matrix factorization or factor analysis is an important task that is helpful in the analysis of high-dimensional real-world data. SVD is a classical method for matrix factorization, which gives the optimal low-rank approximation to a real-valued matrix in terms of the squared error. Many application areas, including information retrieval, pattern recognition, and data mining, require processing of binary rather than real data.
Ke-Lin Du, M. N. S. Swamy
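The claim about SVD's optimal low-rank approximation (the Eckart-Young result) can be checked numerically; the sketch below compares the rank-k reconstruction error with the discarded singular values on a hypothetical matrix.

```python
import numpy as np

# Truncated SVD gives the best rank-k approximation in squared (Frobenius) error.
rng = np.random.default_rng(6)
A = rng.normal(size=(8, 6))                     # hypothetical matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]     # rank-k approximation

err = np.linalg.norm(A - A_k, 'fro') ** 2
print(err, np.sum(s[k:] ** 2))                  # the two values agree
```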
Chapter 14. Independent Component Analysis
Abstract
Imagine that you are attending a cocktail party: the room is full of chatter and noise, and somebody is talking about you. In this case, your ears are particularly sensitive to this speaker. This is the cocktail-party problem, which can be solved by blind source separation (BSS).
Ke-Lin Du, M. N. S. Swamy
Chapter 15. Discriminant Analysis
Abstract
Discriminant analysis plays an important role in statistical pattern recognition. LDA, originally derived by Fisher, is one of the most popular discriminant analysis techniques. Under the assumption that the classes are Gaussian distributed with identical covariance matrices, LDA is Bayes optimal [44]. Like PCA, LDA is widely applied to image retrieval, face recognition, information retrieval, and pattern recognition.
Ke-Lin Du, M. N. S. Swamy
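A minimal two-class Fisher LDA sketch on hypothetical Gaussian-like data: the discriminant direction is obtained from the within-class scatter matrix and the class means.

```python
import numpy as np

# Fisher LDA sketch for two classes: the discriminant direction is
# S_w^{-1} (m1 - m2), where S_w is the within-class scatter matrix.
rng = np.random.default_rng(7)
X1 = rng.normal(loc=[0, 0], scale=1.0, size=(100, 2))
X2 = rng.normal(loc=[2, 1], scale=1.0, size=(100, 2))

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)   # within-class scatter
w = np.linalg.solve(S_w, m1 - m2)                          # Fisher direction

# project and classify by which side of the midpoint a sample falls on
z = np.vstack([X1, X2]) @ w
threshold = 0.5 * (m1 + m2) @ w
pred = np.where(z > threshold, 0, 1)
print(pred[:100].mean(), pred[100:].mean())   # ~0 for the first class, ~1 for the second
```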
Chapter 16. Support Vector Machines
Abstract
SVM [12, 201] is one of the most popular nonparametric classification algorithms, and it is grounded in computational learning theory [200, 202]. The goal of SVM is to minimize the VC dimension by finding the optimal hyperplane between classes with the maximal margin, where the margin is defined as the distance from the closest point in each class to the separating hyperplane. It combines a general-purpose linear learning algorithm with a problem-specific kernel that computes the inner product of input data points in a feature space. The key idea of SVM is to project the training set from a low-dimensional input space into a high-dimensional feature space by means of a set of nonlinear kernel functions, so that the projected training examples become linearly separable in the feature space. The hippocampus, a brain region critical for learning and memory processes, has been reported to possess a pattern-separation function similar to SVM [6].
Ke-Lin Du, M. N. S. Swamy
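As a quick illustration (using scikit-learn, a library choice not made by the book), the sketch below fits an RBF-kernel SVM to a hypothetical nonlinearly separable problem and inspects its support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# RBF-kernel SVM sketch on hypothetical two-class data that is not
# linearly separable in the input space.
rng = np.random.default_rng(8)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)   # circular decision boundary

clf = SVC(kernel='rbf', C=1.0, gamma='scale')          # maximal-margin classifier in feature space
clf.fit(X, y)
print(clf.n_support_)                                   # number of support vectors per class
print(clf.score(X, y))                                  # training accuracy
```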
Chapter 17. Other Kernel Methods
Abstract
The kernel method was originally introduced by Aizerman et al. (Autom. Remote Control, 25, 821–837, 1964). The key idea is to project the training set from a lower-dimensional space into a high-dimensional kernel (feature) space by means of a set of nonlinear kernel functions.
Ke-Lin Du, M. N. S. Swamy
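The kernel trick can be demonstrated directly: a degree-2 polynomial kernel evaluated in the input space equals the inner product of an explicit higher-dimensional feature map. The feature map and inputs below are hypothetical.

```python
import numpy as np

# Kernel-trick sketch: k(x, z) = (x . z)^2 computes the inner product of an
# explicit higher-dimensional feature map without ever forming it.
def phi(x):
    # explicit feature map for the degree-2 polynomial kernel in 2 dimensions
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_direct = (x @ z) ** 2            # kernel evaluated in the input space
k_feature = phi(x) @ phi(z)        # same value via the explicit feature map
print(k_direct, k_feature)         # both equal 1.0
```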
Chapter 18. Reinforcement Learning
Abstract
Reinforcement learning has its origin in the psychology of animal learning. It rewards the learner (agent) for correct actions and punishes it for wrong ones. In the mammalian brain, learning by reinforcement is a function of brain nuclei known as the basal ganglia. The basal ganglia use this reward-related information to modulate sensory-motor pathways so as to render future behaviors more rewarding [16].
Ke-Lin Du, M. N. S. Swamy
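A minimal tabular Q-learning sketch on a hypothetical five-state chain world illustrates the reward-driven update; the environment, rewards, and parameters are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

# Tabular Q-learning on a tiny chain world: states 0..4, actions 0=left, 1=right,
# reward 1 on reaching state 4. Q-learning is off-policy, so a random behaviour
# policy still lets it learn the greedy (rightward) policy.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9
rng = np.random.default_rng(9)

def step(s, a):
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    return s_next, (1.0 if s_next == n_states - 1 else 0.0)

for episode in range(500):
    s = 0
    for _ in range(20):
        a = rng.integers(n_actions)                  # explore at random
        s_next, r = step(s, a)
        # reward-driven update toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if r == 1.0:
            break

print(Q.argmax(axis=1))   # greedy action is "right" (1) in states 0-3; state 4 is terminal
```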
Chapter 19. Probabilistic and Bayesian Networks
Abstract
The Bayesian network model was introduced by Pearl in 1985 [147]. It is the best-known family of graphical models in artificial intelligence (AI). Bayesian networks are a powerful tool for knowledge representation and reasoning with partial beliefs under uncertainty. They are probabilistic models that combine probability theory and graph theory.
Ke-Lin Du, M. N. S. Swamy
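The combination of probability and graph structure can be illustrated with a tiny chain network whose joint distribution factorizes over the graph; the variables and conditional probability tables below are hypothetical numbers chosen only for illustration.

```python
# Bayesian-network sketch: a three-node chain Cloudy -> Rain -> WetGrass,
# whose joint factorizes as P(C) * P(R | C) * P(W | R).
P_C = {True: 0.5, False: 0.5}
P_R_given_C = {True: {True: 0.8, False: 0.2},
               False: {True: 0.1, False: 0.9}}      # P(R = r | C = c)
P_W_given_R = {True: {True: 0.9, False: 0.1},
               False: {True: 0.2, False: 0.8}}      # P(W = w | R = r)

def joint(c, r, w):
    return P_C[c] * P_R_given_C[c][r] * P_W_given_R[r][w]

# marginal P(WetGrass = True) by summing the joint over the other variables
p_wet = sum(joint(c, r, True) for c in (True, False) for r in (True, False))
print(joint(True, True, True), p_wet)
```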
Chapter 20. Combining Multiple Learners: Data Fusion and Ensemble Learning
Abstract
Different learning algorithms have different accuracies, and the no-free-lunch theorem asserts that no single learning algorithm always achieves the best performance in every domain. Multiple learners can, however, be combined to attain higher accuracy. Data fusion is the process of fusing multiple records representing the same real-world object into a single, consistent, and clean representation. Fusion of data for improving prediction accuracy and reliability is an important problem in machine learning.
Ke-Lin Du, M. N. S. Swamy
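A simple way to see the benefit of combining learners is majority voting over base classifiers with (assumed) independent errors; the error rates and data below are hypothetical.

```python
import numpy as np

# Majority-vote ensemble sketch: three base classifiers with independent errors.
rng = np.random.default_rng(10)
n = 10000
truth = rng.integers(0, 2, size=n)           # ground-truth binary labels

def noisy_classifier(error_rate):
    flip = rng.random(n) < error_rate
    return np.where(flip, 1 - truth, truth)  # wrong with probability error_rate

preds = np.stack([noisy_classifier(0.2) for _ in range(3)])
vote = (preds.sum(axis=0) >= 2).astype(int)  # majority of the three

print((preds[0] != truth).mean())            # single learner: ~0.20 error
print((vote != truth).mean())                # ensemble: ~0.10 error (3*0.2^2*0.8 + 0.2^3)
```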
Chapter 21. Introduction to Fuzzy Sets and Logic
Abstract
In many soft sciences (e.g., psychology, sociology, ethology), scientists provide verbal descriptions and explanations of various phenomena based on observations.
Ke-Lin Du, M. N. S. Swamy
Chapter 22. Neurofuzzy Systems
Abstract
The neurofuzzy system is inspired by the biological-cognitive synergism in human intelligence. It is the synergism between the neuronal transduction/processing of sensory signals, and the corresponding cognitive, perceptual, and linguistic functions of the brain.
Ke-Lin Du, M. N. S. Swamy
Chapter 23. Neural Circuits and Parallel Implementation
Abstract
Hardware technologies for implementing neural networks can be either analog or digital. Analog hardware is an attractive choice, although the design of analog chips requires good theoretical knowledge of transistor physics as well as experience. Weights in a neural network can be coded by a single analog element (e.g., a resistor). Very simple rules such as Kirchhoff's laws can be used to carry out addition of input signals. As an example, Boltzmann machines can be easily implemented by amplifying the natural noise present in analog devices.
Ke-Lin Du, M. N. S. Swamy
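The role of Kirchhoff's laws can be sketched numerically: with weights stored as conductances and inputs applied as voltages, the current summed at a node equals a weighted sum of the inputs. The component values below are hypothetical.

```python
import numpy as np

# Analog weighted summation sketch: with conductances G_i = 1/R_i as weights and
# input voltages V_i, Kirchhoff's current law gives I = sum_i G_i * V_i at the node.
V = np.array([0.5, -0.2, 1.0])        # input voltages (volts)
R = np.array([10e3, 20e3, 5e3])       # resistors coding the weights (ohms)
G = 1.0 / R                           # conductances = effective weights

I = np.sum(G * V)                     # total current into the summing node (amperes)
print(I)                              # equals the weighted sum of the inputs
```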
Chapter 24. Pattern Recognition for Biometrics and Bioinformatics
Abstract
Biometrics are the personal or physical characteristics of a person, and they are usually used for identification or verification. Biometric recognition systems are increasingly being deployed as a more natural, more secure, and more efficient means than the conventional password-based method for the recognition of people. Many biometric verification systems have been developed for global security.
Ke-Lin Du, M. N. S. Swamy
Chapter 25. Data Mining
Abstract
The Web is the world's largest source of information, recording the real world from many aspects at every moment. This success owes much to XML-based technology, which provides a means of information interchange between applications as well as a semistructured data model for integrating information and knowledge. Information retrieval has enabled the development of useful web search engines. Relevance criteria based on both textual contents and link structures are very useful for effectively retrieving text-rich documents.
Ke-Lin Du, M. N. S. Swamy
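As one illustration of link-structure-based relevance (a generic PageRank-style computation, not a method attributed to the book), the sketch below scores a tiny hypothetical link graph by power iteration.

```python
import numpy as np

# Link-structure relevance sketch: PageRank-style scores for a tiny link graph.
links = np.array([[0, 1, 1, 0],        # row i lists which pages page i links to
                  [0, 0, 1, 0],
                  [1, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

# column-stochastic transition matrix: follow an outgoing link uniformly at random
M = (links / links.sum(axis=1, keepdims=True)).T
d = 0.85                               # damping factor
n = len(links)

r = np.full(n, 1.0 / n)
for _ in range(100):
    r = d * (M @ r) + (1 - d) / n      # power iteration for the PageRank vector

print(r)                               # relative importance of the four pages
```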
Backmatter
Metadata
Title
Neural Networks and Statistical Learning
Authors
Ke-Lin Du
M. N. S. Swamy
Copyright Year
2014
Publisher
Springer London
Electronic ISBN
978-1-4471-5571-3
Print ISBN
978-1-4471-5570-6
DOI
https://doi.org/10.1007/978-1-4471-5571-3
