Computer Vision Projects with PyTorch
Design and Develop Production-Grade Models
- 2022
- Book
- Authors
- Akshay Kulkarni
- Adarsha Shivananda
- Nitin Ranjan Sharma
- Publisher
- Apress
About this book
Design and develop end-to-end, production-grade computer vision projects for real-world industry problems. This book discusses computer vision algorithms and their applications using PyTorch.
The book begins with the fundamentals of computer vision: convolutional neural networks, ResNet, YOLO, data augmentation, and other regularization techniques used in the industry. It then gives you a quick overview of the PyTorch libraries used in the book. After that, it takes you through the implementation of image classification problems, object detection techniques, and transfer learning for both training and inference. The book covers image segmentation and an anomaly detection model, and it discusses the fundamentals of video processing for computer vision tasks, including putting images into videos. The book concludes with an explanation of the complete model building process for deep learning frameworks using optimized techniques, with highlights on model explainability.
After reading this book, you will be able to build your own computer vision projects using transfer learning and PyTorch.
What You Will Learn
- Solve problems in computer vision with PyTorch
- Implement transfer learning and perform image classification, object detection, image segmentation, and other computer vision applications
- Design and develop production-grade computer vision projects for real-world industry problems
- Interpret computer vision models and solve business problems
Who This Book Is For
Data scientists and machine learning engineers interested in building computer vision projects and solving business problems
Table of Contents
-
Frontmatter
-
Chapter 1. The Building Blocks of Computer Vision
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Humans have been part of a natural evolutionary pattern for centuries. According to the Flynn Effect, an average person born in recent times has a higher IQ than the average person born in the previous century. Human intelligence allows us to learn and to make new decisions based on what we have learned. We use the IQ score to quantify human intelligence, but what about machines? Machines are also part of this evolutionary journey. How have we moved our focus to machines and made them intelligent, as we know them today? Let’s take a quick look at this history.
-
Chapter 2. Image Classification
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
The last chapter discussed several important concepts in computer vision, along with a few best practices in the field, so it is time to put them into practice. This chapter sets the tone for multiple tasks in the field of computer vision. We start with a basic explanation of how to use the Torch components to build a model, define a loss function, and train the model.
-
Chapter 3. Building an Object Detection Model
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Object detection is one of the most sought-after skills these days. An image can contain objects of multiple classes, and classifying an object solves just part of the problem. The other part lies in the localization of the object. Object detection helps identify both the class and the location of an object in an image with a bounding box. The bounding box can be further processed for various sub-tasks. As an example, think about what a traffic cam needs to detect and identify cars.
-
Chapter 4. Building an Image Segmentation Model
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Images around us come in different textures, patterns, shapes, and sizes. They carry an enormous amount of information which can easily be understood by the human eye and brain, but is still a difficult problem for computers. Image segmentation is a problem set wherein we try to train computers to understand images so that they can separate dissimilar objects and unite similar objects. This can be in the form of similar pixel intensities or similar textures and shapes.
-
Chapter 5. Image-Based Search and Recommendation System
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
To retain existing customers and acquire new ones, especially in the e-commerce arena, customer service needs to be top-notch. There are already thousands of e-commerce platforms and the number will only increase in the future. Platforms with excellent customer experience will survive in the long term.
-
Chapter 6. Pose Estimation
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Human pose estimation (HPE) is a computer vision task that detects human poses by estimating major keypoints, such as eyes, ears, hands, and legs, in a given frame/video. Figure 6-1 shows an example of human pose estimation in action.
-
Chapter 7. Image Anomaly Detection
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
The study of machine learning has put us on the course of studying various patterns and behavior. It has allowed us to build models that can study closed environments. Predictive power often follows the model training process, and it is an important question that we need to ask often when we are training a model. There is another question that needs an answer: how much data is sufficient to help the model understand the distribution such that we can have a good representation? This chapter works through an example and the concepts behind these important questions, in the context of anomaly detection in computer vision.
-
Chapter 8. Image Super-Resolution
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
With the advent of high-resolution image-capturing devices, the amount of information captured in images is huge. Technology has moved from ultra HD to 4K and 8K resolutions. Movies are using high-resolution frames these days; however, there are also situations in which a low-resolution image needs to be enhanced to a high-resolution one. Imagine a scene where the protagonist of a movie is trying to determine the license plate captured in a picture of a speeding car. Super-resolution can now help us zoom into an image to a high degree without distorting it. A few interesting advancements have happened in the industry and we are going to discuss those with some examples.
-
Chapter 9. Video Analytics
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Machine learning began long ago with structured data and the process of extracting meaningful predictions from it. As data grew, machine learning started exploring other data types as well. Today, there is no limit to the types of data that can be processed.
-
Chapter 10. Explainable AI for Computer Vision
Akshay Kulkarni, Adarsha Shivananda, Nitin Ranjan Sharma
Abstract
Most machine learning and deep learning models lack a way of explaining and interpreting results. Due to the dynamic nature of deep learning models and the growing number of state-of-the-art models, current model evaluation is based on accuracy scores. This makes machine learning and deep learning models black boxes, which leads to a lack of confidence in applying the model and a lack of trust in the generated results. Multiple libraries, such as SHAP and LIME, help us explain models of structured data. This chapter explains computer vision model outputs.
-
Backmatter
- Title
- Computer Vision Projects with PyTorch
- Authors
- Akshay Kulkarni
- Adarsha Shivananda
- Nitin Ranjan Sharma
- Copyright Year
- 2022
- Publisher
- Apress
- Electronic ISBN
- 978-1-4842-8273-1
- Print ISBN
- 978-1-4842-8272-4
- DOI
- https://doi.org/10.1007/978-1-4842-8273-1