
About this book

The practice of robotics and the practice of computer vision both involve the application of computational algorithms to data. Over the fairly short history of the two fields a very large body of algorithms has been developed. However, this body of knowledge is something of a barrier for anybody entering the field, or even looking to see whether they want to enter it: what is the right algorithm for a particular problem, and, importantly, how can I try it out without spending days coding and debugging it from the original research papers?

The author has maintained two open-source MATLAB Toolboxes for more than 10 years: one for robotics and one for vision. The key strength of the Toolboxes is that they provide a set of tools that allow the user to work with real problems, not trivial examples. For the student the book makes the algorithms accessible: the Toolbox code can be read to gain understanding, and the examples illustrate how it can be used, giving instant gratification in just a couple of lines of MATLAB code. The code can also be the starting point for new work, for researchers or students, whether by writing programs based on Toolbox functions or by modifying the Toolbox code itself.

The purpose of this book is to expand on the tutorial material provided with the Toolboxes, add many more examples, and weave this into a narrative that covers robotics and computer vision both separately and together. The author shows how complex problems can be decomposed and solved using just a few simple lines of code, and hopes to inspire up-and-coming researchers. The topics covered are guided by the real problems the author has observed over many years as a practitioner of both robotics and computer vision. Written in a light but informative style, the book is easy to read and absorb, and includes many MATLAB examples and figures. It is a walk through the fundamentals of robot kinematics, dynamics and joint-level control, then camera models, image processing, feature extraction and epipolar geometry, and brings it all together in a visual servo system.

Additional material is provided at http://www.petercorke.com/RVC

Table of contents

Frontmatter

Introduction

Abstract
The term robot means different things to different people. Science fiction books and movies have strongly influenced what many people expect a robot to be or what it can do. Sadly the practice of robotics is far behind this popular conception. One thing is certain though – robotics will be an important technology in this century. Products such as vacuum cleaning robots are the vanguard of a wave of smart machines that will appear in our homes and workplaces.
Peter Corke

Part I Foundations

Frontmatter

Representing Position and Orientation

Abstract
A fundamental requirement in robotics and computer vision is to represent the position and orientation of objects in an environment. Such objects include robots, cameras, workpieces, obstacles and paths.
A point in space is a familiar concept from mathematics and can be described by a coordinate vector, also known as a bound vector, as shown in Fig. 2.1a. The vector represents the displacement of the point with respect to some reference coordinate frame. A coordinate frame, or Cartesian coordinate system, is a set of orthogonal axes which intersect at a point known as the origin.
Peter Corke
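The flavour of the Toolbox treatment can be seen in a minimal sketch (assuming the Robotics Toolbox is on the MATLAB path; the numeric values are illustrative only):

  >> T1 = transl(1, 2, 0) * trotz(pi/4)  % frame translated by (1,2,0) and rotated 45 deg about the z-axis
  >> T2 = T1 * transl(0.5, 0, 0)         % a second frame defined relative to the first
  >> p = homtrans(T1, [1 0 0]')          % a point known in frame 1, expressed in world coordinates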

Time and Motion

Abstract
In the previous chapter we learnt how to describe the pose of objects in 2- or 3-dimensional space. This chapter extends those concepts to objects whose pose is varying as a function of time.
For robots we wish to create a time-varying pose that the robot can follow; for example, the pose of a robot’s end-effector should follow a path to the object that it is to grasp. Section 3.1 discusses how to generate a temporal sequence of poses, a trajectory, that smoothly changes from an initial pose to a final pose.
Section 3.2 discusses the concept of rate of change of pose, its temporal derivative, and how that relates to concepts from mechanics such as velocity and angular velocity. This allows us to solve the inverse problem – given measurements from velocity and angular velocity sensors how do we update the estimate of pose for a moving object. This is the principle underlying inertial navigation.
Peter Corke
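A hedged sketch of the trajectory tools this chapter introduces, using the Toolbox functions tpoly and ctraj (the poses and step count are illustrative):

  >> s = tpoly(0, 1, 50);                     % smooth quintic profile from 0 to 1 in 50 steps
  >> T0 = transl(0.4, 0.2, 0) * trotx(pi/2);  % initial pose
  >> T1 = transl(0.4, -0.2, 0.3);             % final pose
  >> Ts = ctraj(T0, T1, 50);                  % pose trajectory smoothly interpolating T0 to T1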

Part II Mobile Robots

Frontmatter

Mobile Robot Vehicles

Abstract
This chapter discusses how a robot platform moves, that is, how its pose changes with time as a function of its control inputs. There are many different types of robot platform as shown on pages 61–63 but in this chapter we will consider only two which are important exemplars. The first is a wheeled vehicle like a car which operates in a 2-dimensional world. It can be propelled forwards or backwards and its heading direction controlled by changing the angle of its steered wheels. The second platform is a quadrotor, a flying vehicle, which is an example of a robot that moves in 3-dimensional space. Quadrotors are becoming increasingly popular as a robot platform since they can be quite easily modelled and controlled.
However, before we start to discuss these two robot platforms it will be helpful to consider some general, but important, concepts regarding mobility.
Peter Corke
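A minimal plain-MATLAB sketch of the kinematic bicycle model discussed in this chapter (all parameter values are illustrative assumptions):

  % Kinematic bicycle model: state (x, y, theta); inputs are speed v and
  % steered-wheel angle gamma; L is the wheelbase.
  L = 2.5; dt = 0.05;            % wheelbase (m) and integration step (s)
  x = [0; 0; 0];                 % initial pose
  v = 1.0; gamma = 0.3;          % constant speed and steer angle
  for k = 1:200                  % integrate the motion for 10 s
      x = x + dt * [v*cos(x(3)); v*sin(x(3)); (v/L)*tan(gamma)];
  end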

Navigation

Abstract
Robot navigation is the problem of guiding a robot towards a goal. The human approach to navigation is to make maps and erect signposts, and at first glance it seems obvious that robots should operate the same way. However, many robotic tasks can be achieved without any map at all, using an approach referred to as reactive navigation: for example, heading towards a light, following a white line on the ground, moving through a maze by following a wall, or vacuuming a room by following a random path. The robot is reacting directly to its environment: the intensity of the light, the relative position of the white line, or contact with a wall. Grey Walter’s tortoise Elsie from page 61 demonstrated “life-like” behaviours - she reacted to her environment and could seek out a light source. Today more than 5 million Roomba vacuum cleaners are cleaning floors without using any map of the rooms they work in. These robots work by making random moves and sensing only that they have made contact with an obstacle.
Peter Corke
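A minimal sketch of one reactive strategy in plain MATLAB; the light-field function and all values here are hypothetical stand-ins for a real sensor:

  % Steer in proportion to the difference between two light readings, one
  % taken to the left and one to the right of the heading direction.
  sensorfield = @(x, y) exp(-((x-40).^2 + (y-50).^2)/400);  % assumed light map
  pose = [10; 10; 0]; dt = 0.1; v = 1;                      % illustrative values
  for k = 1:500
      left  = sensorfield(pose(1)+cos(pose(3)+0.3), pose(2)+sin(pose(3)+0.3));
      right = sensorfield(pose(1)+cos(pose(3)-0.3), pose(2)+sin(pose(3)-0.3));
      pose = pose + dt*[v*cos(pose(3)); v*sin(pose(3)); 2*(left-right)];
  end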

Localization

Abstract
In our discussion of map-based navigation we assumed that the robot had a means of knowing its position. In this chapter we discuss some of the common techniques used to estimate the location of a robot in the world - a process known as localization.
Today GPS makes outdoor localization so easy that we often take this capability for granted. Unfortunately GPS is a far from perfect sensor since it relies on very weak radio signals received from distant orbiting satellites. This means that GPS cannot work where there is no line-of-sight radio reception, for instance indoors, underwater, underground, in urban canyons or in deep mining pits. Because the signals are so weak they can also be easily jammed, which is not acceptable for some applications.
Peter Corke
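A hedged sketch of dead-reckoning localization, assuming the Toolbox's Vehicle, RandomPath and EKF classes behave as in the book's examples (all covariance values are illustrative):

  >> V = diag([0.02, 0.5*pi/180].^2);     % odometry noise covariance
  >> veh = Vehicle(V);                    % vehicle with uncertain odometry
  >> veh.add_driver( RandomPath(10) );    % drive randomly within a 10 x 10 region
  >> P0 = diag([0.005, 0.005, 0.001].^2); % initial pose covariance
  >> ekf = EKF(veh, V, P0);               % dead-reckoning filter
  >> ekf.run(1000);                       % simulate 1000 time steps
  >> ekf.plot_xy('r')                     % overlay the estimated path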

Part III Arm-Type Robots

Frontmatter

Robot Arm Kinematics

Abstract
Kinematics is the branch of mechanics that studies the motion of a body, or a system of bodies, without consideration given to its mass or the forces acting on it. A serial-link manipulator comprises a chain of mechanical links and joints. Each joint can move its outward neighbouring link with respect to its inward neighbour. One end of the chain, the base, is generally fixed and the other end is free to move in space and holds the tool or end-effector.
Figure 7.1 shows two classical robots that are the precursor to all arm-type robots today. Each robot has six joints and clearly the pose of the end-effector will be a complex function of the state of each joint. Section 7.1 describes a notation for describing the link and joint structure of a robot and Sect. 7.2 discusses how to compute the pose of the end-effector. Section 7.3 discusses the inverse problem, how to compute the position of each joint given the end-effector pose. Section 7.4 describes methods for generating smooth paths for the end-effector. The remainder of the chapter covers advanced topics and two complex applications: writing on a plane surface and a four-legged walking robot.
Peter Corke
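The instant-gratification style in practice, as a minimal sketch using the Toolbox's Puma 560 model (mdl_puma560 defines the robot object p560 and named configurations such as qn):

  >> mdl_puma560                 % load the Puma 560 model
  >> T = p560.fkine(qn)          % forward kinematics: end-effector pose at configuration qn
  >> q = p560.ikine6s(T)         % closed-form inverse kinematics recovers a joint configuration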

Velocity Relationships

Abstract
In this chapter we consider the relationship between the rate of change of joint coordinates, the joint velocity, and the velocity of the end-effector. The 3-dimensional end-effector pose ξ ∈ SE(3) has a velocity which is represented by a 6-vector known as a spatial velocity. The joint velocity and the end-effector velocity are related by the manipulator Jacobian matrix which is a function of manipulator pose.
Section 8.1 uses a numerical approach to introduce the manipulator Jacobian. Next we introduce additional Jacobians to transform velocity between coordinate frames and angular velocity between different angular representations. The numerical properties of the Jacobian matrix are shown to provide insight into the dexterity of the manipulator - the directions in which it can move easily and those in which it cannot - and understanding about singular configurations. In Sect. 8.2 the inverse Jacobian is used to generate Cartesian paths without requiring inverse kinematics, and this can be applied to over- and under-actuated robots. Section 8.3 demonstrates how the Jacobian transpose is used to transform forces from the end-effector to the joints and between coordinate frames. Finally, in Sect. 8.4 the numeric inverse kinematic solution, used in the previous chapter, is fully described.
Peter Corke
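A minimal Toolbox sketch of the Jacobian relationship this chapter develops (the desired spatial velocity is illustrative):

  >> mdl_puma560
  >> J = p560.jacob0(qn)              % 6x6 manipulator Jacobian in the world frame
  >> qd = inv(J) * [0.1 0 0 0 0 0]'   % joint rates for 0.1 m/s end-effector motion in x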

Dynamics and Control

Abstract
In this chapter we consider the dynamics and control of a serial-link manipulator. Each link is supported by a reaction force and torque from the preceding link, and is subject to its own weight as well as the reaction forces and torques from the links that it supports.
Section 9.1 introduces the equations of motion, a set of coupled dynamic equations, that describe the joint torques necessary to achieve a particular manipulator state. The equations contain terms for inertia, gravity and gyroscopic coupling. The equations of motion provide insight into important issues such as how the motion of one joint exerts a disturbance force on other joints, how inertia and gravity load varies with configuration, and the effect of payload mass. Section 9.2 introduces real-world drive train issues such as gearing and friction. Section 9.3 introduces the forward dynamics which describe how the manipulator moves, that is, how its configuration evolves with time in response to forces and torques applied at the joints by the actuators, and by external forces such as gravity. Section 9.4 introduces control systems that compute the joint forces so that the robot end-effector follows a desired trajectory despite varying dynamic characteristics or joint flexibility.
Peter Corke
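A hedged sketch of inverse dynamics with the Toolbox's Puma 560 model (qz, the all-zero configuration, doubles here as zero joint velocity and acceleration):

  >> mdl_puma560
  >> tau = p560.rne(qn, qz, qz)  % inverse dynamics: torques to hold configuration qn at rest
  >> g = p560.gravload(qn)       % the gravity component of those torques alone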

Part IV Computer Vision

Frontmatter

Light and Color

Abstract
In ancient times it was believed that the eye radiated a cone of visual flux which mixed with visible objects in the world to create a sensation in the observer, like the sense of touch - the extramission theory. Today we consider that light from an illuminant falls on the scene, some of which is reflected into the eye of the observer to create a perception about that scene. The light that reaches the eye, or the camera, is a function of the illumination impinging on the scene and the material property known as reflectivity.
This chapter is about light itself and our perception of light in terms of brightness and color. Section 10.1 describes light in terms of electromagnetic radiation and mixtures of light as continuous spectra. Section 10.2 provides a brief introduction to colorimetry, the science of color perception, human trichromatic color perception and how colors can be represented in various color spaces. Section 10.3 covers a number of advanced topics such as color constancy, gamma correction, and an example concerned with distinguishing different colored objects in an image.
Peter Corke
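A short Toolbox sketch of the spectral view of light this chapter takes (the wavelength range and temperature are chosen for illustration):

  >> lambda = [300:10:1000]*1e-9;       % wavelengths from 300 to 1000 nm, in metres
  >> E = blackbody(lambda, 6500);       % blackbody emission spectrum at 6500 K
  >> plot(lambda, E)                    % a continuous spectrum
  >> rg = lambda2rg(500e-9)             % r-g chromaticity of 500 nm monochromatic light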

Image Formation

Abstract
In this chapter we discuss how images are formed and captured, the first step in robot and human perception of the world. From images we can deduce the size, shape and position of objects in the world as well as other characteristics such as color and texture.
It has long been known that a simple pin-hole is able to create a perfect inverted image on the wall of a darkened room. Some marine molluscs, for example the Nautilus, have pin-hole camera eyes. All vertebrates have a lens that forms an inverted image on the retina where the light-sensitive rod and cone cells, shown previously in Fig. 10.6, are arranged. A digital camera is similar in principle - a glass or plastic lens forms an image on the surface of a semiconductor chip with an array of light sensitive devices to convert light to a digital image.
The process of image formation, in an eye or in a camera, involves a projection of the 3-dimensional world onto a 2-dimensional surface. The depth information is lost and we can no longer tell from the image whether it is of a large object in the distance or a smaller closer object. This transformation from 3 to 2 dimensions is known as perspective projection and is discussed in Sect. 11.1. Section 11.2 introduces the topic of camera calibration, the estimation of the parameters of the perspective transformation. In Sect. 11.2.3 we discuss the inverse problem, how to reconstruct 3-dimensional world points given a 2-dimensional image. Section 11.3 introduces alternative types of cameras capable of wide-angle or panoramic imaging.
Peter Corke
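A minimal sketch of perspective projection using the Toolbox's CentralCamera class (the camera parameters and world point are typical illustrative values):

  >> cam = CentralCamera('focal', 0.015, 'pixel', 10e-6, ...
         'resolution', [1280 1024], 'centre', [640 512]);  % perspective camera model
  >> P = [0.3 0.4 3.0]';                % a world point, metres
  >> p = cam.project(P)                 % its image-plane projection, in pixel coordinates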

Image Processing

Abstract
Image processing is a computational process that transforms one or more input images into an output image. Image processing is frequently used to enhance an image for human viewing or interpretation, for example to improve contrast. Alternatively, and of more interest to robotics, it is the foundation for the process of feature extraction which will be discussed in much more detail in the next chapter.
Peter Corke
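A hedged sketch of the workflow, using Toolbox functions (the image file name is assumed to be one that ships with the Machine Vision Toolbox):

  >> im = iread('monalisa.png', 'grey', 'double');  % load as greyscale, double precision
  >> idisp(im)                                      % interactive display
  >> im2 = ismooth(im, 5);                          % Gaussian smoothing, sigma = 5 pixels
  >> idisp(im2)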

Image Feature Extraction

Abstract
In the last chapter we discussed the acquisition and processing of images. We learnt that images are simply large arrays of pixel values but for robotic applications images have too much data and not enough information. We need to be able to answer pithy questions such as: what is the pose of the object? What type of object is it? How fast is it moving? How fast am I moving? The answers to such questions are measurements obtained from the image, which we call image features. Features are the gist of the scene and the raw material that we need for robot control.
Peter Corke
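A minimal corner-feature sketch using Toolbox functions (the image file name here is an assumption, for illustration only):

  >> im = iread('building2.png', 'grey', 'double'); % assumed example image
  >> C = icorner(im, 'nfeat', 200);                 % the 200 strongest Harris corner features
  >> idisp(im); C.plot('ws');                       % overlay them as white squares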

Using Multiple Images

Abstract
In the previous chapter we learnt about corner detectors which find particularly distinctive points in a scene. These points can be reliably detected in different views of the same scene irrespective of viewpoint or lighting conditions. Such points are characterised by high image gradients in orthogonal directions and typically occur on the corners of objects. However the 3-dimensional coordinate of the corresponding world point was lost in the perspective projection process which we discussed in Chap. 11 - we mapped a 3-dimensional world point to a 2-dimensional image coordinate. All we know is that the world point lies along some ray in space corresponding to the pixel coordinate, as shown in Fig. 11.1. To recover the missing third dimension we need additional information. In Sect. 11.2.3 the additional information was camera calibration parameters plus a geometric object model, and this allowed us to estimate the object’s 3-dimensional pose from the 2-dimensional image data.
Peter Corke
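A hedged sketch of feature matching and robust epipolar geometry estimation with Toolbox functions (the image files are assumed to be a two-view pair distributed with the Toolbox):

  >> im1 = iread('eiffel2-1.jpg', 'mono', 'double');  % two views of the same scene
  >> im2 = iread('eiffel2-2.jpg', 'mono', 'double');
  >> sf1 = isurf(im1); sf2 = isurf(im2);              % SURF features in each view
  >> m = sf1.match(sf2);                              % candidate point correspondences
  >> F = m.ransac(@fmatrix, 1e-4);                    % robust estimate of the fundamental matrix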

Part V Robotics, Vision and Control

Frontmatter

Vision-Based Control

Abstract
The task in visual servoing is to control the pose of the robot’s end-effector, relative to the target, using visual features extracted from the image. As shown in Fig. 15.1 the camera may be carried by the robot or fixed in the world. The configuration of Fig. 15.1a has the camera mounted on the robot’s end-effector observing the target, and is referred to as end-point closed-loop or eye-in-hand. The configuration of Fig. 15.1b has the camera at a fixed point in the world observing both the target and the robot’s end-effector, and is referred to as end-point open-loop. In the remainder of this book we will discuss only the eye-in-hand configuration.
Peter Corke
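A minimal IBVS sketch, assuming the Toolbox's CentralCamera class and its point-feature image Jacobian; the gain, depth and desired feature location are illustrative:

  >> cam = CentralCamera('default');     % canonical perspective camera
  >> p = cam.project([0.1 0.1 3]');      % current image-plane location of a world point
  >> pd = [550 550]';                    % desired image-plane location
  >> J = cam.visjac_p(p, 3);             % 2x6 image Jacobian, assuming point depth Z = 3 m
  >> v = 0.1 * pinv(J) * (pd - p)        % camera velocity that drives the feature toward pd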

Advanced Visual Servoing

Abstract
This chapter builds on the previous one and introduces some advanced visual servo techniques and applications. Section 16.1 introduces a hybrid visual servo method that avoids some of the limitations of the IBVS and PBVS schemes described previously.
Wide-angle cameras such as fisheye lenses and catadioptric cameras have significant advantages for visual servoing. Section 16.2 shows how IBVS can be reformulated for polar rather than Cartesian image-plane coordinates. This is directly relevant to fisheye lenses but also gives improved rotational control when using a perspective camera. The unified imaging model from Sect. 11.4 allows most cameras (perspective, fisheye and catadioptric) to be represented by a spherical projection model, and Sect. 16.3 shows how IBVS can be reformulated for spherical coordinates.
The remaining sections present a number of application examples. These illustrate how visual servoing can be used with different types of cameras (perspective and spherical) and different types of robots (arm-type robots, mobile ground robots and flying robots). Section 16.4 considers a 6 degree of freedom robot arm manipulating the camera. Section 16.5 considers a mobile robot moving to a specific pose, which could be used for navigating through a doorway or docking. Finally, Sect. 16.6 considers visual servoing of a quadrotor flying robot to hover at a fixed pose with respect to a target on the ground.
Peter Corke
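A minimal plain-MATLAB sketch of the Cartesian-to-polar image-plane conversion underlying the polar IBVS formulation (the principal point and feature location are illustrative):

  % Convert a Cartesian image-plane coordinate (u, v), relative to the
  % principal point (u0, v0), to the polar form (r, phi) used by polar IBVS.
  u0 = 512; v0 = 512;                      % assumed principal point, pixels
  p = [700; 400];                          % an example feature location
  r   = sqrt((p(1)-u0)^2 + (p(2)-v0)^2);   % radial coordinate, pixels
  phi = atan2(p(2)-v0, p(1)-u0);           % angular coordinate, radians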

Backmatter

Further information