2018 | Book

Practical Computer Vision Applications Using Deep Learning with CNNs

With Detailed Examples in Python Using TensorFlow and Kivy

About this book

Deploy deep learning applications into production across multiple platforms. You will work on computer vision applications that use the convolutional neural network (CNN) deep learning model and Python. This book starts by explaining the traditional machine-learning pipeline, where you will analyze an image dataset. Along the way you will cover artificial neural networks (ANNs), building one from scratch in Python, before optimizing it using genetic algorithms.

To automate the process, the book highlights the limitations of traditional hand-crafted features for computer vision and explains why the CNN deep-learning model is the state-of-the-art solution. CNNs are discussed from scratch to demonstrate how they differ from, and are more efficient than, the fully connected ANN (FCNN). You will implement a CNN in Python to give you a full understanding of the model.
After consolidating the basics, you will use TensorFlow to build a practical image-recognition model that you will deploy to a web server using Flask, making it accessible over the Internet. Using Kivy and NumPy, you will create cross-platform data science applications with low overheads.
This book will help you apply deep learning and computer vision concepts from scratch, step-by-step from conception to production.

What You Will Learn

Understand how ANNs and CNNs work
Create computer vision applications and CNNs from scratch using Python
Follow a deep learning project from conception to production using TensorFlow
Use NumPy with Kivy to build cross-platform data science applications

Who This Book Is For

Data scientists, machine learning and deep learning engineers, software developers.

Table of Contents

Frontmatter
Chapter 1. Recognition in Computer Vision
Abstract
Most computer science research tries to build a human-like robot that is able to function exactly as humans do. Even emotional properties are not impossible for such robots. Using a sensor, the robot feels the temperature in the surrounding environment. Using facial expressions, it is possible to know whether a person is sad or happy. Even things that seem impossible might eventually prove to be only challenging.
Ahmed Fawzy Gad
Chapter 2. Artificial Neural Networks
Abstract
Machine learning (ML) problems can be divided into three categories: supervised, unsupervised, and reinforcement. In supervised learning, a human expert conducts some experiments in a restricted environment and observes their results. The supervised learning algorithm explores the data collected from the experiments to map inputs to outputs. For example, a restricted environment might have a robot that wants to go from one side of a small room to the other. There are some obstacles in the room that may make the robot fall. The supervisor provides guidance about how to reach the wall without falling. This is done by giving the robot knowledge in the form of examples to help it learn how to pass obstacles. The robot uses this knowledge to increase the probability of passing an obstacle without falling. In such a case, the knowledge of the robot is completely dependent on the human.
Ahmed Fawzy Gad
Chapter 3. Recognition Using ANN with Engineered Features
Abstract
The three pillars of a successful ML application are the data, the features, and the model. They should work well together. The features used are those most relevant for differentiating among the different cases in the data. Representative features are critical in building an accurate ML application. They should be robust enough to work well under different conditions such as a change in scale and rotation. Such features should also work well with the selected ML model. You shouldn’t use more features than needed, because this adds more complexity to the model. Feature selection and reduction techniques are used to find the minimum set of features needed to build an accurate model.
Ahmed Fawzy Gad
Chapter 4. ANN Optimization
Abstract
Before the advent of automatic feature learning approaches, a data scientist was expected to know what features to use, which model to use, how to optimize the result, and more. With the availability of huge amounts of data and high-speed devices, DL can automatically deduce the best features. Two of the core tasks of a data scientist are model design and optimization.
Ahmed Fawzy Gad
Chapter 5. Convolutional Neural Networks
Abstract
The previously discussed architecture of ANNs is called FC neural networks (FCNNs). The reason is that each neuron in a layer i is connected to all neurons in layers i-1 and i+1. Each connection between two neurons has two parameters: the weight and the bias. Adding more layers and neurons increases the number of parameters. As a result, it is very time-consuming to train such networks even on devices with multiple graphics processing units (GPUs) and multiple central processing units (CPUs). It becomes impossible to train such networks on PCs with limited processing and memory capabilities.
Ahmed Fawzy Gad
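The parameter growth described in the Chapter 5 abstract can be illustrated with a short Python sketch. It is not taken from the book: the function name `count_fc_parameters` and the layer sizes are hypothetical, and it assumes the common convention of one weight per connection plus one bias per non-input neuron.

```python
def count_fc_parameters(layer_sizes):
    """Count trainable parameters in a fully connected network.

    layer_sizes lists the number of neurons per layer, input first.
    Assumed convention: one weight per connection between adjacent
    layers, plus one bias per neuron in every non-input layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # weights between the two layers
        total += n_out         # biases of the receiving layer
    return total


# Hypothetical networks for a 28x28 grayscale image (784 inputs), 10 classes.
small = count_fc_parameters([784, 16, 10])
large = count_fc_parameters([784, 512, 512, 10])
print(small)  # 12730
print(large)  # 669706
```

Widening and deepening the hidden layers takes the count from about 13 thousand to about 670 thousand parameters, which is the kind of growth that makes large FCNNs expensive to train.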
Chapter 6. TensorFlow Recognition Application
Abstract
Building a DL model such as a CNN from scratch using NumPy, as we did, helps us better understand how each layer works in detail. For practical applications, using such an implementation is not recommended. One reason is that it is computationally intensive and requires effort to optimize the code. Another is that it does not support distributed processing, GPUs, and many other features. On the other hand, there are existing libraries that support these features in a time-efficient manner. These libraries include TF, Keras, Theano, PyTorch, Caffe, and more.
Ahmed Fawzy Gad
Chapter 7. Deploying Pretrained Models
Abstract
In the pipeline of building DL models, creating the model is the hardest step, but it is not the end. In order to benefit from the created models, users should remotely access them. Users’ feedback will help improve the model performance.
Ahmed Fawzy Gad
Chapter 8. Cross-Platform Data Science Applications
Abstract
There are releases from the current DL libraries that support building applications for mobile devices. For example, TensorFlowLite, Caffe Android, and Torch Android are all releases from TF, Caffe, and Torch, respectively, to support mobile devices. These releases are based on their parents. There must be an in-between step in order to make the original model work on mobile devices. For example, the process of creating an Android application that uses TensorFlowLite has the following summarized steps:
Ahmed Fawzy Gad
Backmatter
Metadata
Title
Practical Computer Vision Applications Using Deep Learning with CNNs
Author
Ahmed Fawzy Gad
Copyright Year
2018
Publisher
Apress
Electronic ISBN
978-1-4842-4167-7
Print ISBN
978-1-4842-4166-0
DOI
https://doi.org/10.1007/978-1-4842-4167-7