2020 | Book

Deep Learning in Mining of Visual Content

About this book

This book provides the reader with fundamental knowledge of deep learning as applied to visual content mining. The authors give a fresh view of deep learning approaches from the standpoint of both image understanding and supervised machine learning. The book contains chapters that introduce the theoretical and mathematical foundations of neural networks and the related optimization methods. It then discusses two very popular architectures used in the domain: convolutional neural networks and recurrent neural networks.
Deep learning is currently at the heart of most cutting-edge technologies and at the core of recent advances in artificial intelligence. Visual information in digital form is constantly growing in volume. In active domains such as computer vision and robotics, visual information understanding relies on deep learning. Other chapters present applications of deep learning to visual content mining. These include attention mechanisms in deep neural networks and their application to digital cultural content mining. A further application field is also discussed, illustrating how deep learning can benefit computer-aided diagnosis of Alzheimer’s disease on multimodal imaging.
This book targets advanced-level students studying computer science, including computer vision, data analytics and multimedia. Researchers and professionals working in computer science and in signal and image processing may also be interested in this book.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
Visual content mining has a long history and has been a central problem in the field of computer vision. It consists of finding and correctly labelling objects in images or video sequences and of recognizing static and dynamic scenes. It is needed in a large set of research and application domains: multimedia indexing and retrieval, computer vision, robotics, computer-aided diagnosis using medical images, and others. Humans are naturally good at visual scene recognition and perform it without any particular effort. Automatic object and scene recognition, however, remains a challenging task.
Akka Zemmari, Jenny Benois-Pineau
Chapter 2. Supervised Learning Problem Formulation
Abstract
In machine learning we distinguish a range of approaches lying between two extremes: unsupervised and supervised learning. The task of unsupervised learning consists of grouping similar data points in the description space, thus inducing a structure on it. The data model can then be expressed in terms of a partition of that space. Probably the most popular grouping algorithm in visual content mining is the K-means approach, introduced by MacQueen as early as 1967; at least, this is the approach that was used for the very popular Bag-of-Visual-Words model mentioned in Chap. 1. The deep learning approach belongs to the family of supervised learning methods designed for both classification and regression. In this very short chapter we focus on the formal definition of the supervised learning approach, and also on the fundamentals of evaluating classification algorithms, since the evaluation metrics are used later in the book.
Akka Zemmari, Jenny Benois-Pineau
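To make the mention of evaluation metrics in the Chapter 2 abstract concrete, here is a minimal Python sketch for a binary classifier. The abstract does not say which metrics the chapter covers, so the choice of accuracy, precision and recall is an assumption, and the function name `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary labels.

    Assumption: these three metrics are chosen for illustration only;
    the source abstract does not list specific metrics.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground-truth labels and predictions.
metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

With these sample labels there are two true positives, two true negatives, one false positive and one false negative, so all three metrics come out to 2/3.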
Chapter 3. Neural Networks from Scratch
Abstract
Artificial neural networks consist of distributed information processing units. In this chapter, we define the components of such networks. We first introduce the elementary unit: the formal neuron proposed by McCulloch and Pitts. We then explain how such units can be assembled to design simple neural networks.
Akka Zemmari, Jenny Benois-Pineau
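The formal neuron named in the Chapter 3 abstract can be sketched as a threshold unit. The version below is a minimal illustration in the spirit of McCulloch and Pitts, not the book's own formulation: the weighted-sum form, the function names and the particular weights and thresholds are all assumptions chosen for the example.

```python
def formal_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted input sum reaches the threshold.

    Assumption: a weighted threshold unit is used here as a common
    textbook generalization of the McCulloch-Pitts neuron.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


def and_gate(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return formal_neuron([a, b], [1, 1], threshold=2)


def or_gate(a, b):
    # A single active input is enough to reach a threshold of 1.
    return formal_neuron([a, b], [1, 1], threshold=1)
```

Changing only the threshold turns the same unit from a logical AND into a logical OR, which hints at how such units can be combined into simple networks.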
Chapter 4. Optimization Methods
Abstract
Machine learning models aim to construct a prediction function that minimizes a loss function. There are many algorithms for minimizing the loss function. Most of them are iterative and operate by decreasing the loss function along a descent direction. These methods solve the problem when the loss function is assumed to be convex. The main idea can be expressed simply as follows: starting from an arbitrarily (or randomly) chosen initial point in the parameter space, they “descend” to the minimum of the loss function according to the chosen set of directions. Here we discuss some of the best-known and most widely used optimization algorithms in this field.
Akka Zemmari, Jenny Benois-Pineau
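The descent idea in the Chapter 4 abstract can be sketched with plain gradient descent on a convex loss. This is a minimal illustration under stated assumptions: the abstract does not name specific algorithms, and the quadratic loss, the learning rate, the step count and all function names below are hypothetical choices.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Iteratively move against the gradient from an initial point."""
    point = list(start)
    for _ in range(steps):
        g = grad(point)
        # The negative gradient is the descent direction used here.
        point = [p - learning_rate * gi for p, gi in zip(point, g)]
    return point


def loss(w):
    # Hypothetical convex loss with its minimum at (3, -1).
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


def loss_grad(w):
    # Analytical gradient of the loss above.
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


# Start from an arbitrary point in the parameter space.
minimum = gradient_descent(loss_grad, start=[0.0, 0.0])
```

For this particular loss, each step shrinks the distance to the minimum by a constant factor of 0.8, so 100 steps bring the point very close to (3, -1). On a non-convex loss the same procedure may stop at a local minimum instead.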
Chapter 5. Deep in the Wild
Abstract
In this chapter we are interested in how high-resolution images and videos, passed through a deep convolutional neural network, are reduced to a lower-dimensional representation that finally allows a classification decision. We focus on two operations, convolution and pooling, and draw an analogy with the same operations in a classical image processing framework.
Akka Zemmari, Jenny Benois-Pineau
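The two operations named in the Chapter 5 abstract, convolution and pooling, can be sketched on a small grayscale image stored as nested lists. This is an illustrative sketch, not the book's implementation: the function names, the 5x5 image and the 2x2 summing kernel are assumptions, and the "convolution" is computed without flipping the kernel (strictly a cross-correlation, the convention most deep learning frameworks follow).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + u][j + v] * kernel[u][v]
                for u in range(kh)
                for v in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + u][j * size + v]
                for u in range(size)
                for v in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 image with pixel values 0..24 in row-major order.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 1], [1, 1]]  # sums each 2x2 neighbourhood

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)           # 2x2 map: [[36, 44], [76, 84]]
```

The 5x5 input shrinks to 4x4 after the convolution and to 2x2 after pooling, which is the kind of dimension reduction the abstract describes.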
Chapter 6. Convolutional Neural Networks as Image Analysis Tool
Abstract
After studying the fundamental operations of convolution and sub-sampling in the previous chapter, we introduce convolutional neural networks here and consider those designed for a particular kind of data: images. We first present some general principles, then go into detail layer by layer, and finally give a brief overview of the most popular convolutional neural network architectures.
Akka Zemmari, Jenny Benois-Pineau
Chapter 7. Dynamic Content Mining
Abstract
Neural networks and convolutional neural networks can be considered as functions that take a vector as input and compute a distribution over the set of possible classes. Such networks have no notion of temporal order and no memory. They are therefore not well suited to dynamic content mining tasks such as speech recognition and video processing. In this chapter we introduce models able to handle the temporality of visual content.
Akka Zemmari, Jenny Benois-Pineau
Chapter 8. Case Study for Digital Cultural Content Mining
Abstract
In this chapter we consider an application of deep learning to the task of architectural recognition. The main objective is to identify both architectural styles and specific architectural structures. We are interested in attention mechanisms in deep CNNs and explain how real visual attention maps, built from human gaze fixations, can help in the training of deep neural networks.
Akka Zemmari, Jenny Benois-Pineau
Chapter 9. Introducing Domain Knowledge
Abstract
In this chapter we consider another application of deep learning: the classification of brain images for the detection of Alzheimer’s disease. In this particular medical imaging application, deep NNs have become an essential tool. We highlight how the usual steps in the design of a deep neural network classifier are carried out when domain knowledge has to be taken into account. Beyond that, faithful to our principle of showing new aspects of deep learning even in application cases, we show how information fusion with siamese CNNs helps to improve the performance of these classifiers.
Akka Zemmari, Jenny Benois-Pineau
Backmatter
Metadata
Title
Deep Learning in Mining of Visual Content
Authors
Assoc. Prof. Akka Zemmari
Prof. Dr. Dr. Jenny Benois-Pineau
Copyright Year
2020
Electronic ISBN
978-3-030-34376-7
Print ISBN
978-3-030-34375-0
DOI
https://doi.org/10.1007/978-3-030-34376-7