
2022 | Book

Geometry of Deep Learning

A Signal Processing Perspective


About this Book

The focus of this book is on providing students with insights into geometry that can help them understand deep learning from a unified perspective. Rather than describing deep learning as an implementation technique, as many existing deep learning books do, it explains deep learning as the ultimate form of signal processing technique that one can imagine.

To support this claim, an overview of classical kernel machine learning approaches is presented, and their advantages and limitations are explained. Following a detailed explanation of the basic building blocks of deep neural networks from a biological and algorithmic point of view, the latest tools such as attention, normalization, the Transformer, BERT, GPT-3, and others are described. Here, too, the focus is on showing that behind the intuition of these heuristic approaches lies an important and beautiful geometric structure that enables a systematic understanding. A unified geometric analysis of the working mechanism of deep learning in terms of high-dimensional geometry is then offered. Finally, different forms of generative models such as GANs, VAEs, normalizing flows, and optimal transport are described from a unified geometric perspective, showing that they all arise from statistical distance-minimization problems.
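One concrete instance of this distance-minimization claim, stated here as a summary of standard results rather than a quotation from the book: with an optimal discriminator, the original GAN generator objective reduces to minimizing the Jensen–Shannon divergence $\mathrm{JSD}(p_{\mathrm{data}} \,\|\, p_G)$ between the data and generator distributions; the Wasserstein GAN instead minimizes the optimal-transport distance $W_1(p_{\mathrm{data}}, p_G)$; and the VAE maximizes an evidence lower bound whose regularizer is the Kullback–Leibler divergence $\mathrm{KL}(q_\phi(z \mid x) \,\|\, p(z))$.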

Because this book contains up-to-date information from both a practical and a theoretical point of view, it can be used as an advanced deep learning textbook in universities or as a reference source for researchers interested in the latest deep learning algorithms and their underlying principles. In addition, the book has been prepared for a course shared by engineering and mathematics students, so much of the content is interdisciplinary and will appeal to students from both disciplines.

Table of Contents

Frontmatter

Basic Tools for Machine Learning

Frontmatter
Chapter 1. Mathematical Preliminaries
Abstract
In this chapter, we briefly review the basic mathematical concepts that are required to understand the materials of this book.
Jong Chul Ye
Chapter 2. Linear and Kernel Classifiers
Abstract
Classification is one of the most basic tasks in machine learning. In computer vision, an image classifier is designed to assign input images to their corresponding categories. Although this task appears trivial to humans, automated classification by computer algorithms poses considerable challenges. (A minimal linear classifier is sketched after this entry.)
Jong Chul Ye
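To make the classification task concrete, here is a minimal sketch of a linear classifier trained with the classic perceptron rule on toy 2D data; the data, sizes, and numbers are illustrative choices, not code from the book.

```python
import numpy as np

# Toy data: two Gaussian blobs in 2D, labels y in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)), rng.normal(+1.0, 0.5, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# Linear classifier: predict sign(w @ x + b); train with the perceptron rule,
# updating only on misclassified points.
w, b = np.zeros(2), 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:  # misclassified (or on the boundary)
            w += yi * xi
            b += yi

print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```

Kernel classifiers, the other topic of this chapter, replace the inner product with a kernel evaluation to obtain nonlinear decision boundaries.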
Chapter 3. Linear, Logistic, and Kernel Regression
Abstract
In machine learning, regression analysis refers to a process for estimating the relationships between dependent and independent variables. This method is mainly used to predict and to find cause-and-effect relationships between variables. For example, in linear regression, a researcher tries to find the line that best fits the data according to a certain mathematical criterion (see Fig. 3.1a; a least-squares sketch follows this entry).
Jong Chul Ye
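As a concrete instance of fitting a line "according to a certain mathematical criterion", here is a minimal least-squares sketch; the synthetic data and numbers are illustrative, not from the book.

```python
import numpy as np

# Synthetic data from a noisy line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(100)

# Least-squares criterion: minimize sum_i (y_i - (a*x_i + b))^2.
# np.linalg.lstsq solves the normal equations for the design matrix [x, 1].
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"fitted line: y = {a:.2f} x + {b:.2f}")  # close to y = 2.00 x + 1.00
```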
Chapter 4. Reproducing Kernel Hilbert Space, Representer Theorem
Abstract
One of the key concepts in machine learning is the feature space, which is often referred to as the latent space. A feature space is usually a higher- or lower-dimensional space than the original one where the input data lie (which is often referred to as the ambient space). (A kernel feature-map sketch follows this entry.)
Jong Chul Ye
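The following sketch illustrates the feature-space idea with the classic kernel trick: a polynomial kernel evaluated in the two-dimensional ambient space equals an inner product in a three-dimensional feature space. This is a standard textbook identity, not code from the book.

```python
import numpy as np

# For x, z in R^2, the kernel k(x, z) = (x . z)^2 equals <phi(x), phi(z)>
# with the explicit feature map phi(x) = (x1^2, x2^2, sqrt(2)*x1*x2).
def phi(x):
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print((x @ z) ** 2)      # kernel in the ambient space -> 1.0
print(phi(x) @ phi(z))   # inner product in the feature space -> 1.0
```

A reproducing kernel Hilbert space generalizes this construction to feature spaces, possibly infinite-dimensional, defined implicitly by the kernel alone.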

Building Blocks of Deep Learning

Frontmatter
Chapter 5. Biological Neural Networks
Abstract
A biological neural network is composed of a group of connected neurons. A single neuron may be connected to many other neurons, and the total number of neurons and connections in a network may be very large. One of the amazing aspects of biological neural networks is that when neurons are connected to each other, a higher-level intelligence emerges that cannot be observed in a single neuron.
Jong Chul Ye
Chapter 6. Artificial Neural Networks and Backpropagation
Abstract
Inspired by the biological neural network, here we discuss its mathematical abstraction known as the artificial neural network (ANN). Although efforts have been made to model every aspect of the biological neuron mathematically, not all of them may be necessary: rather, there are a few key aspects that should not be neglected when modeling a neuron. These include weight adaptation and nonlinearity (both are illustrated in the sketch after this entry).
Jong Chul Ye
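A minimal sketch of both key ingredients, weight adaptation and nonlinearity: a one-hidden-layer network with tanh activations trained by manually derived backpropagation. The sizes and toy data are illustrative assumptions, not code from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]          # toy regression target

W1 = 0.5 * rng.standard_normal((8, 2))       # hidden-layer weights
w2 = 0.5 * rng.standard_normal(8)            # output weights
lr, n = 0.05, len(X)

for _ in range(500):
    h = np.tanh(X @ W1.T)                    # forward pass: nonlinearity
    err = h @ w2 - y                         # dL/dy_hat for L = mean sq. err / 2
    grad_w2 = h.T @ err / n                  # backprop to output weights
    dh = np.outer(err, w2) * (1 - h ** 2)    # chain rule through tanh
    grad_W1 = dh.T @ X / n                   # backprop to hidden weights
    w2 -= lr * grad_w2                       # weight adaptation
    W1 -= lr * grad_W1

print("final MSE:", np.mean((np.tanh(X @ W1.T) @ w2 - y) ** 2))
```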
Chapter 7. Convolutional Neural Networks
Abstract
A convolutional neural network (CNN, or ConvNet) is a class of deep neural networks widely used for analyzing and processing images. Multilayer perceptrons, which we discussed in the previous chapter, usually require fully connected networks, where each neuron in one layer is connected to all neurons in the next layer. Unfortunately, this type of connection inevitably increases the number of weights (see the weight count after this entry).
Jong Chul Ye
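A quick back-of-the-envelope count shows why full connectivity is expensive and why convolutional weight sharing helps; the image and filter sizes below are illustrative, not taken from the book.

```python
# One dense layer mapping a 64x64 grayscale image to a same-sized output:
h = w = 64
print((h * w) ** 2)    # 16,777,216 weights

# A convolutional layer with sixteen 3x3 filters on one input channel
# reuses the same small set of weights at every image position:
print(3 * 3 * 1 * 16)  # 144 weights (plus biases)
```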
Chapter 8. Graph Neural Networks
Abstract
Many important real-world data sets are available in the form of graphs or networks: social networks, the world-wide web (WWW), protein-interaction networks, brain networks, molecule networks, etc. See some examples in Fig. 8.1. In fact, the complex interactions in real systems can be described by different forms of graphs, making graphs a ubiquitous tool for representing complex systems.
Jong Chul Ye
Chapter 9. Normalization and Attention
Abstract
In this chapter, we will discuss two exciting and rapidly evolving technical areas of deep learning: normalization and attention.
Jong Chul Ye

Advanced Topics in Deep Learning

Frontmatter
Chapter 10. Geometry of Deep Neural Networks
Abstract
In this chapter, which is mathematically intensive, we will try to answer perhaps the most important questions of machine learning: what does a deep neural network learn, and how does a deep neural network, especially a CNN, accomplish this? The full answer to these basic questions is still a long way off; here are some of the insights we have obtained while traveling toward that destination. In particular, we explain why classic approaches to machine learning, such as the single-layer perceptron or kernel machines, are not enough to achieve the goal, and why a modern CNN turns out to be a promising tool.
Jong Chul Ye
Chapter 11. Deep Learning Optimization
Abstract
In Chap. 6, we discussed various optimization methods for deep neural network training. Although they come in various forms, these algorithms are basically gradient-based local update schemes. However, the biggest obstacle recognized by the entire community is that the loss surfaces of deep neural networks are extremely non-convex and not even smooth. This non-convexity and non-smoothness make the optimization difficult to analyze, and the main concern has been whether popular gradient-based approaches might get stuck in local minimizers (see the toy example after this entry).
Jong Chul Ye
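The local-minimum concern is easy to see even in one dimension. In the sketch below, which uses an illustrative toy loss rather than anything from the book, gradient descent reaches different minimizers from different starting points.

```python
# Gradient descent on f(x) = x^4 - 3x^2 + x, which has two local minima.
f = lambda x: x ** 4 - 3 * x ** 2 + x
grad = lambda x: 4 * x ** 3 - 6 * x + 1

for x0 in (-2.0, 2.0):
    x = x0
    for _ in range(1000):
        x -= 0.01 * grad(x)                 # gradient-based local update
    print(f"start {x0:+.1f} -> x* = {x:+.3f}, f(x*) = {f(x):.3f}")
```

Starting at -2.0 the iterates settle near the global minimizer (about -1.30), while starting at +2.0 they settle in the shallower local minimum near +1.13.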
Chapter 12. Generalization Capability of Deep Learning
Abstract
One of the main reasons for the enormous success of deep neural networks is their amazing ability to generalize, which seems mysterious from the perspective of classical machine learning. In particular, the number of trainable parameters in deep neural networks often exceeds the size of the training data set, a situation that is notorious for overfitting from the point of view of classical statistical learning theory. However, empirical results have shown that deep neural networks generalize well at the test phase, resulting in high performance on unseen data.
Jong Chul Ye
Chapter 13. Generative Models and Unsupervised Learning
Abstract
The last part of our voyage toward the understanding of the geometry of deep learning concerns perhaps the most exciting aspect of deep learning—generative models.
Jong Chul Ye
Chapter 14. Summary and Outlook
Abstract
With the tremendous success of deep learning in recent years, the field of data science has undergone unprecedented changes that can be considered a "revolution". Despite these successes in various areas, there is still a tremendous lack of rigorous mathematical foundations that would enable us to understand why deep learning methods perform well.
Jong Chul Ye
Chapter 15. Bibliography
Jong Chul Ye
Backmatter
Metadata

Title
Geometry of Deep Learning
Author
Prof. Jong Chul Ye
Copyright Year
2022
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-16-6046-7
Print ISBN
978-981-16-6045-0
DOI
https://doi.org/10.1007/978-981-16-6046-7