
About this book

Based on the successful 2014 book published by Apress, this textbook edition is expanded to provide a comprehensive history and state-of-the-art survey of fundamental computer vision methods. With over 800 essential references, as well as chapter-by-chapter learning assignments, both students and researchers can dig deeper into core computer vision topics. The survey covers feature descriptors, regional and global feature metrics, feature learning architectures, deep learning, the neuroscience of vision, and neural networks, along with detailed example architectures that illustrate computer vision hardware and software optimization methods.
To complement the survey, the textbook includes useful analyses which provide insight into the goals of various methods, why they work, and how they may be optimized.
The text delivers an essential survey and a valuable taxonomy, thus providing a key learning tool for students, researchers, and engineers, to supplement the many effective hands-on resources and open source projects, such as OpenCV and other imaging and deep learning tools.

Table of contents

Frontmatter

1. Image Capture and Representation

Abstract
Computer vision starts with images. This chapter surveys a range of topics dealing with capturing, processing, and representing images, including computational imaging, 2D imaging, and 3D depth imaging methods, sensor processing, depth-field processing for stereo and monocular multi-view stereo, and surface reconstruction. A high-level overview of selected topics is provided, with references for the interested reader to dig deeper. Readers with a strong background in the area of 2D and 3D imaging may benefit from a light reading of this chapter.
Scott Krig

2. Image Pre-Processing

Abstract
This chapter describes the methods used to prepare images for further analysis, including interest point and feature extraction. Some of these methods are also useful for global and local feature description, particularly the metrics derived from transforms and basis spaces. The focus is on image preprocessing for computer vision, so we do not cover the entire range of image processing topics applied to areas such as computational photography and photo enhancements. Instead, we refer the interested reader to various other standard resources in Digital Image Processing and Signal Processing as we go along [4, 9, 325, 326], and we also point out interesting research papers that will enhance understanding of the topics.
Scott Krig

3. Global and Regional Features

Abstract
This chapter covers the metrics of general feature description, often used for whole images and image regions, including textural, statistical, model-based, and basis space methods. Texture, a key metric, is a well-known topic within image processing, and it is commonly divided into structural and statistical methods. Structural methods look for features such as edges and shapes, while statistical methods are concerned with pixel value relationships and statistical moments. Methods for modeling image texture also exist, primarily useful for image synthesis rather than for description. Basis spaces, such as the Fourier space, are also used for feature description.
Scott Krig
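
As a rough illustration of the statistical methods mentioned in the Chapter 3 abstract, the sketch below computes the first four statistical moments of the pixel values in an image region. It is not taken from the book; the function name `region_moments` and the choice of population (biased) moments are assumptions made for this example.

```python
import numpy as np

def region_moments(region):
    """Return mean, variance, skewness, and kurtosis of the pixel values
    in a region (hypothetical helper, population moments)."""
    x = np.asarray(region, dtype=np.float64).ravel()
    mean = x.mean()
    centered = x - mean
    variance = (centered ** 2).mean()
    std = np.sqrt(variance)
    if std == 0:
        # A flat region has no spread, so higher moments are reported as 0.
        skewness, kurtosis = 0.0, 0.0
    else:
        skewness = (centered ** 3).mean() / std ** 3
        kurtosis = (centered ** 4).mean() / std ** 4
    return {"mean": mean, "variance": variance,
            "skewness": skewness, "kurtosis": kurtosis}

# Example: a tiny 2x2 region with two dark and two bright pixels.
moments = region_moments([[0, 0], [2, 2]])
```

Descriptors of this kind summarize a whole region with a few numbers, which is why they are grouped with global and regional metrics and not with local keypoint descriptors.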

4. Local Feature Design Concepts

Abstract
In this chapter we examine several concepts related to local feature descriptor design—namely local patterns, shapes, spectra, distance functions, classification, matching, and object recognition. The main focus is local feature metrics, as shown in Fig. 4.1. This discussion follows the general vision taxonomy that is presented in Chap. 5, and includes key fundamentals for understanding interest point detectors and feature descriptors, as surveyed in Chap. 6, including selected concepts common to both detector and descriptor methods. Note that the opportunity always exists to modify as well as mix and match detectors and descriptors to achieve the best results.
Scott Krig

5. Taxonomy of Feature Description Attributes

Abstract
This chapter develops a general Vision Metrics Taxonomy for feature description, so as to collect summary descriptor attributes for high-level analysis. The taxonomy includes a set of general robustness criteria for feature description and ground truth datasets. The material presented and discussed in this book follows and reflects this taxonomy. By developing a standard vocabulary in the taxonomy, terms and techniques are intended to be consistently communicated and better understood. The taxonomy is used in the survey of feature descriptor methods in Chap. 6 to record “what” practitioners are doing.
Scott Krig

6. Interest Point Detector and Feature Descriptor Survey

Abstract
Many algorithms for computer vision rely on locating interest points, or keypoints, in each image, and calculating a feature description from the pixel region surrounding the interest point. This is in contrast to methods such as correlation, where a larger rectangular pattern is stepped over the image at pixel intervals and the correlation is measured at each location. The interest point is the anchor point, and often provides the scale, rotational, and illumination invariance attributes for the descriptor; the descriptor adds more detail and more invariance attributes. Groups of interest points and descriptors together describe the actual objects.
Scott Krig
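
The correlation approach that the Chapter 6 abstract contrasts with interest points can be sketched as follows: a rectangular template is stepped over the image at a fixed pixel interval and scored at each location. This is a minimal sketch, not the book's code; the function name `correlate_template`, the `step` parameter, and the use of normalized cross-correlation as the score are assumptions chosen for illustration.

```python
import numpy as np

def correlate_template(image, template, step=1):
    """Step a rectangular template over the image and score each location
    with normalized cross-correlation (hypothetical helper)."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())
    rows = range(0, ih - th + 1, step)
    cols = range(0, iw - tw + 1, step)
    scores = np.zeros((len(rows), len(cols)))
    for r_i, r in enumerate(rows):
        for c_i, c in enumerate(cols):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = t_norm * np.sqrt((p * p).sum())
            # Flat patches or templates have no defined correlation; score 0.
            scores[r_i, c_i] = (t * p).sum() / denom if denom > 0 else 0.0
    return scores

# Example: cut a template out of a random image and find it again.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
template = image[10:18, 5:13].copy()
scores = correlate_template(image, template)
best = np.unravel_index(np.argmax(scores), scores.shape)
```

Every window position is evaluated here, whereas an interest point method would compute descriptors only around detected keypoints.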

7. Ground Truth Data, Content, Metrics, and Analysis

Abstract
This chapter discusses several topics pertaining to ground truth data, the basis for computer vision metric analysis. We look at examples to illustrate the importance of ground truth data design and use, including manual and automated methods. We then propose a method and corresponding ground truth dataset for measuring interest point detector response as compared to human visual system response and human expectations. Also included here are example applications of the general robustness criteria and the general vision taxonomy developed in Chap. 5 as applied to the preparation of hypothetical ground truth data. Lastly, we look at the current state of the art, its best practices, and a survey of available ground truth datasets.
Scott Krig

8. Vision Pipelines and Optimizations

Abstract
This chapter explores some hypothetical computer vision pipeline designs to understand HW/SW design alternatives and optimizations. Instead of looking at isolated computer vision algorithms, this chapter ties together many concepts into complete vision pipelines. Vision pipelines are sketched out for a few example applications to illustrate the use of different methods. Example applications include object recognition using shape and color for automobiles, face detection and emotion detection using local features, image classification using global features, and augmented reality. The examples have been chosen to illustrate the use of different families of feature description metrics within the Vision Metrics Taxonomy presented in Chap. 5. Alternative optimizations at each stage of the vision pipeline are explored. For example, we consider which vision algorithms run better on a CPU versus a GPU, and discuss how data transfer time between compute units and memory affects performance.
Scott Krig

9. Feature Learning Architecture Taxonomy and Neuroscience Background

Abstract
In many respects, computer vision practitioners are now being outpaced by neuroscientists, who are leading the way, modeling computer vision systems directly after neurobiology, and borrowing from computer vision and imaging to simulate the biology and theories of the human visual system. The state of the art in computer vision is rapidly moving towards synthetic brains and synthetic vision systems, similar to other biological sciences where we see synthetic biology such as prosthetics, robotics, and genomic engineering. Computer vision is becoming a subset of neuroscience and vision sciences, where researchers implement complete synthetic vision models, leveraging computer vision and imaging methods, leaving some computer vision and imaging methods in the wake of history.
Scott Krig

10. Feature Learning and Deep Learning Architecture Survey

Abstract
In this chapter we look at a wide range of feature learning architectures and deep learning architectures, which incorporate a range of feature models and classification models. This chapter digs deeper into the background concepts of feature learning and artificial neural networks summarized in the taxonomy of Chap. 9, and complements the local and regional feature descriptor surveys in Chaps. 3-6. The architectures in the survey represent significant variations across neural-network approaches, local feature descriptor and classification-based approaches, and ensemble approaches. The architecture taken together as the sum of its parts is apparently more important than individual parts or components of the design, such as the choice of feature descriptor, number of levels in the feature hierarchy, number of features per layer, or the choice of classifier. Good results are being reported across a wide range of architectures.
Scott Krig

Backmatter
