OpenCV Primer

Project Details

The files you need for this chapter are in the code folder; copy them to the micro SD card, which will save you hours building with Yocto. The tools and ipk packages used in this chapter require more space than the SPI images can support, so a micro SD card is necessary.

Materials List

Quantity | Component
---|---
1 | Logitech C270 webcam
1 | OTG-USB 2.0 adapter, micro-USB male to USB A female (only for Intel Galileo)
1 | Micro SD card, 4GB to a maximum of 32GB
USB Video Class

The webcam must comply with the USB Video Class (UVC) specification, handled by the driver uvcvideo, which is supported by the BSP SD card software releases. In this case, the device is a simple webcam, but there are other types of devices that support UVC, such as transcoders, video recorders, camcorders, and so on. To confirm that a device is supported, visit http://www.ideasonboard.org/uvc/ and check the Supported Devices section, as shown in Figure 7-3. If the device is listed, it is supported by the uvcvideo driver and will be classified as "Device Works" or "Device Works with Issues."

Preparing the BSP Software Image and Toolchain

If you prefer not to build the image yourself, open the /code/SDcard folder of this chapter and copy its contents to your micro SD card. Doing so will save you hours.

Using eGlibc for Video4Linux Image
Open .../meta-clanton-distro/recipes-multimedia/v4l2apps/v4l-utils_0.8.8.bbappend and comment out all three lines using your favorite text editor:

#FILESEXTRAPATHS_prepend := "${THISDIR}/files:"
#SRC_URI += "file://uclibc-enable.patch"
#DEPENDS += "virtual/libiconv"
Increasing the rootfs Size

The image also needs a bigger rootfs size. Edit the .../meta-clanton-distro/recipes-core/image/image-full.bb file by changing the following lines:

IMAGE_ROOTFS_SIZE = "507200"
IMAGE_FEATURES += "package-management dev-pkgs"
IMAGE_INSTALL += "autoconf automake binutils binutils-symlinks cpp cpp-symlinks gcc gcc-symlinks g++ g++-symlinks gettext make libstdc++ libstdc++-dev file coreutils"

With these changes, the rootfs size (IMAGE_ROOTFS_SIZE) is increased; the image features (IMAGE_FEATURES) are enhanced with the integration of the development packages (dev-pkgs); and the additions to IMAGE_INSTALL make a series of development tools part of the image (g++, make, and so on).

Disabling GPU Support on OpenCV
Open .../meta-oe/meta-oe/recipes-support/opencv/opencv_2.4.3.bb and .../meta-clanton-distro/recipes-support/opencv/opencv_2.4.3.bbappend. Make the same change to the EXTRA_OECMAKE variable in both files:

EXTRA_OECMAKE = "-DPYTHON_NUMPY_INCLUDE_DIR:PATH=${STAGING_LIBDIR}/${PYTHON_DIR}/site-packages/numpy/core/include \
-DBUILD_PYTHON_SUPPORT=ON \
-DWITH_FFMPEG=ON \
-DWITH_CUDA=OFF \
-DBUILD_opencv_gpu=OFF \
-DWITH_GSTREAMER=OFF \
-DWITH_V4L=ON \
-DWITH_GTK=ON \
-DCMAKE_SKIP_RPATH=ON \
${@bb.utils.contains("TARGET_CC_ARCH", "-msse3", "-DENABLE_SSE=1 -DENABLE_SSE2=1 -DENABLE_SSE3=1 -DENABLE_SSSE3=1", "", d)} \
"
Building the SD Image and Toolchain

cd meta-clanton*
./setup.sh
source poky/oe-init-build-env yocto_build

The bitbake command that builds the SD card image is:

bitbake image-full-galileo

To build the toolchain as well, run:

bitbake image-full-galileo -c populate_sdk
Development Library Packages

If a program fails with an error such as:

error while loading shared libraries: libopencv_gpu.so.2.4: cannot open shared object file: No such file or directory

you need to install some development packages (ipk files) individually as well. The code folder contains a tarball named ipk.tar.gz with all the ipk files needed for OpenCV and V4L. Copy that file to Intel Galileo and install the libraries using opkg. To decompress and install the ipk files for OpenCV and V4L, use the following commands:

root@clanton:# tar -zxvf ipk.tar.gz
root@clanton:# cd ipk
root@clanton:# opkg install libopencv-gpu2.4_2.4.3-r2_i586.ipk libopencv-stitching2.4_2.4.3-r2_i586.ipk libopencv-ts2.4_2.4.3-r2_i586.ipk libopencv-videostab2.4_2.4.3-r2_i586.ipk libv4l-dev_0.8.8-r2_i586.ipk libv4l-dbg_0.8.8-r2_i586.ipk
Connecting the Webcam

Load the uvcvideo driver and connect your webcam:

root@clanton:~# modprobe uvcvideo
[31372.589998] Linux video capture interface: v2.00
[31372.701722] usbcore: registered new interface driver uvcvideo
[31372.707513] USB Video Class driver (1.1.1)

If you cannot load the uvcvideo module driver, it means you have a problem with the custom BSP image. Review the build process or use the micro SD card files provided with this chapter. When the webcam is connected, the kernel reports:

[31474.420165] usb 2-1: new high-speed USB device number 3 using ehci-pci
[31474.801403] uvcvideo: Found UVC 1.00 device <unnamed> (046d:0825)
[31474.930869] input: UVC Camera (046d:0825) as /devices/pci0000:00/0000:00:14.3/usb2/2-1/2-1:1.0/input/input2

The line input: UVC Camera confirms the webcam is in compliance with UVC.

root@clanton:~# ls /dev/video*
/dev/video0

The webcam is enumerated as /dev/video0. The last number (0 in this case) won't always be 0; when you connect the webcam, the driver can assign any integer. For example, if you have a USB host and connect two cameras, one might be /dev/video0 and the other might be /dev/video1. If you keep connecting more webcams to the USB host, each one is mapped and the integer increases: /dev/video2, /dev/video3, and so on. Also, if a webcam mapped as /dev/video0 crashes for some reason and is not released properly, the next time you connect it might be mapped as /dev/video1.
Introduction to Video4Linux

Video4Linux (V4L2) is the Linux API for video capture devices. The complete API documentation is available at http://linuxtv.org/downloads/v4l-dvb-apis.

Exploring the Webcam Capabilities with V4L2-CTL
With the v4l2-ctl tool you can explore:

- The encode/pixel formats supported
- The resolutions supported to capture images
- The resolutions supported to capture video
- The frames per second (fps) supported in different encode modes
- The resolutions that really work

If the tool runs, the webcam was recognized by the uvcvideo driver properly. You can type v4l2-ctl --all to check the current capabilities:

root@clanton:~# v4l2-ctl --all
Driver Info (not using libv4l2):
Driver name : uvcvideo
Card type : UVC Camera (046d:0825)
Bus info : usb-0000:00:14.3-1
Driver version: 3.8.7
Capabilities : 0x84000001
Video Capture
Streaming
Device Capabilities
Device Caps : 0x04000001
Video Capture
Streaming
Priority: 2
Video input : 0 (Camera 1: ok)
Format Video Capture:
Width/Height : 640/480
Pixel Format : 'MJPG'
Field : None
Bytes per Line: 0
Size Image : 341333
Colorspace : SRGB
Crop Capability Video Capture:
Bounds : Left 0, Top 0, Width 640, Height 480
Default : Left 0, Top 0, Width 640, Height 480
Pixel Aspect: 1/1
Streaming Parameters Video Capture:
Capabilities : timeperframe
Frames per second: 30.000 (30/1)
Read buffers : 0
brightness (int) : min=0 max=255 step=1 default=128 value=128
contrast (int) : min=0 max=255 step=1 default=32 value=32
saturation (int) : min=0 max=255 step=1 default=32 value=32
white_balance_temperature_auto (bool) : default=1 value=1
gain (int) : min=0 max=255 step=1 default=64 value=192
power_line_frequency (menu) : min=0 max=2 default=2 value=2
white_balance_temperature (int) : min=0 max=10000 step=10 default=4000 value=1070 flags=inactive
sharpness (int) : min=0 max=255 step=1 default=24 value=24
backlight_compensation (int) : min=0 max=1 step=1 default=0 value=0
exposure_auto (menu) : min=0 max=3 default=3 value=3
exposure_absolute (int) : min=1 max=10000 step=1 default=166 value=667 flags=inactive
exposure_auto_priority (bool) : default=0 value=1

The pixel format is MJPG, which is a motion JPEG stream, and the resolution is 640/480 pixels. The current frames per second (fps) rate is 30, and video cropping is set to the actual video resolution of 640/480, as reported under Crop Capability Video Capture.
Changing and Reading Camera Properties

In the listing above, properties such as brightness, contrast, and saturation are reported. You can change a property using the --set-ctrl argument of the v4l2-ctl tool. Suppose you want to change the contrast attribute from 32 to 40. To do so, type the following in your terminal:

root@clanton:~# v4l2-ctl --set-ctrl=contrast=40

The = sign after --set-ctrl is optional, so this form also works:

root@clanton:~# v4l2-ctl --set-ctrl contrast=40

To read a single property, use --get-ctrl rather than --all, which lists all the properties. For example:

root@clanton:~# v4l2-ctl --get-ctrl contrast
contrast: 40
You can also use the -L argument to get the list of controls. See the following example:

root@clanton:~# v4l2-ctl -L
brightness (int) : min=0 max=255 step=1 default=128 value=128
contrast (int) : min=0 max=255 step=1 default=32 value=40
saturation (int) : min=0 max=255 step=1 default=32 value=32
white_balance_temperature_auto (bool) : default=1 value=1
gain (int) : min=0 max=255 step=1 default=64 value=64
power_line_frequency (menu) : min=0 max=2 default=2 value=2
        0 : Disabled
        1 : 50 Hz
        2 : 60 Hz
white_balance_temperature (int) : min=0 max=10000 step=10 default=4000 value=4000 flags=inactive
sharpness (int) : min=0 max=255 step=1 default=24 value=24
backlight_compensation (int) : min=0 max=1 step=1 default=0 value=0
exposure_auto (menu) : min=0 max=3 default=3 value=3
        1 : Manual Mode
        3 : Aperture Priority Mode
exposure_absolute (int) : min=1 max=10000 step=1 default=166 value=166 flags=inactive
exposure_auto_priority (bool) : default=0 value=1
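When scripting around v4l2-ctl, the control lines above are easy to process because every field is a key=value pair. The following helper is an illustrative sketch, not part of the chapter's code, that extracts one integer field from such a line.

```cpp
// Sketch: pull one "key=value" integer out of a v4l2-ctl control line, e.g.
//   "brightness (int) : min=0 max=255 step=1 default=128 value=128"
// Illustrative only; returns -1 when the field is absent.
#include <string>
#include <cstdlib>

long ctrl_field(const std::string &line, const std::string &key)
{
    std::string needle = key + "=";
    std::string::size_type pos = line.find(needle);
    if (pos == std::string::npos)
        return -1;                         // field not present in this line
    return std::strtol(line.c_str() + pos + needle.size(), 0, 10);
}
```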
Pixel Formats and Resolution

To check the pixel formats supported, run v4l2-ctl --list-formats:

root@clanton:~# v4l2-ctl --list-formats
ioctl: VIDIOC_ENUM_FMT
Index : 0
Type : Video Capture
Pixel Format: 'YUYV'
Name : YUV 4:2:2 (YUYV)

Index : 1
Type : Video Capture
Pixel Format: 'MJPG' (compressed)
Name : MJPEG

YUYV (index 0) and Motion JPEG (index 1) can both capture video, as shown by the field Type. When the v4l2-ctl --all command was executed earlier, the current settings were pointing to MJPG. To change the format, use --set-fmt-video, passing the pixel format index reported by --list-formats:

root@clanton:~# v4l2-ctl --set-fmt-video width=1920,height=780,pixelformat=0

You can confirm the change with --get-fmt-video (instead of --all, used before, in order to have summarized information):

root@clanton:~# v4l2-ctl --get-fmt-video
Format Video Capture:
Width/Height : 1280/720
Pixel Format : 'YUYV'
Field : None
Bytes per Line: 2560
Size Image : 1843200
Colorspace : SRGB

Note that the driver did not accept the requested 1920x780; it selected the closest resolution it supports, 1280/720.
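The four-character codes such as 'YUYV' and 'MJPG' that v4l2-ctl prints are stored by V4L2 as a single 32-bit integer. As a sketch, the function below mirrors the kernel's v4l2_fourcc() macro: each character lands in one byte, the first character in the lowest byte.

```cpp
// Sketch: how a four-character pixel-format code packs into the 32-bit
// value V4L2 reports. Mirrors the kernel's v4l2_fourcc() macro.
#include <cstdint>

constexpr uint32_t fourcc(char a, char b, char c, char d)
{
    return  (uint32_t)(uint8_t)a        |   // first char: low byte
           ((uint32_t)(uint8_t)b << 8)  |
           ((uint32_t)(uint8_t)c << 16) |
           ((uint32_t)(uint8_t)d << 24);    // last char: high byte
}
```

This is why constants like V4L2_PIX_FMT_MJPEG and V4L2_PIX_FMT_YUYV, used later in the capture code, are just these packed codes.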
To see the resolutions and frame intervals supported by each format, use the v4l2-ctl --list-formats-ext command:

root@clanton:~# v4l2-ctl --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
Index : 0
Type : Video Capture
Pixel Format: 'YUYV'
Name : YUV 4:2:2 (YUYV)
Size: Discrete 640x480
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 160x120
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
...
...
...
Size: Discrete 1184x656
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 1280x720
Interval: Discrete 0.133 s (7.500 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 1280x960
Interval: Discrete 0.133 s (7.500 fps)
Interval: Discrete 0.200 s (5.000 fps)
Index : 1
Type : Video Capture
Pixel Format: 'MJPG' (compressed)
Name : MJPEG
Size: Discrete 640x480
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 160x120
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
...
...
...
Size: Discrete 1184x656
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 1280x720
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
Size: Discrete 1280x960
Interval: Discrete 0.033 s (30.000 fps)
Interval: Discrete 0.040 s (25.000 fps)
Interval: Discrete 0.050 s (20.000 fps)
Interval: Discrete 0.067 s (15.000 fps)
Interval: Discrete 0.100 s (10.000 fps)
Interval: Discrete 0.200 s (5.000 fps)
You can change the frame rate using the --set-parm argument. For example, you can set 30 fps using the following:

root@clanton:~# v4l2-ctl --set-parm=30
Frame rate set to 30.000 fps
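The Interval lines in the listing above are just the timeperframe fraction printed two ways: an interval of 1/30 s is 30 fps, and 2/15 s is the 7.500 fps entry. A minimal sketch of that arithmetic:

```cpp
// Sketch: V4L2 reports frame timing as a numerator/denominator fraction
// (timeperframe). fps is simply the reciprocal of the interval.
double interval_seconds(unsigned num, unsigned den)
{
    return (double)num / (double)den;   // e.g. 1/30 -> 0.0333... s
}

double fps(unsigned num, unsigned den)
{
    return (double)den / (double)num;   // e.g. 1/30 -> 30 fps
}
```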
Capturing Videos and Images with libv4l2

V4L2 supports three methods for transferring frames between the device and the application:

- Memory-mapped buffers (mmap): The buffers are allocated in kernel space. The device determines the number of buffers that can be allocated and the size of each buffer. The buffers are mapped into the application's address space with the mmap() function, and the application must request this method using V4L2_MEMORY_MMAP when it queries the buffers. This is the method used with the webcam C270.
- Userspace pointers: The buffers are allocated in the userspace context using the regular malloc() or calloc() functions. In this case, V4L2_MEMORY_USERPTR is used to query the buffers.
- Direct read/write: The application reads/writes the buffers directly, so no mapped memory (mmap) or userspace memory allocation (malloc/calloc) is necessary.
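To see the mmap idea in isolation, the sketch below maps a page of anonymous memory, writes into it, reads it back, and unmaps it with munmap(). This is only an illustration: a real V4L2 capture maps the device's buffers instead, passing the file descriptor and the offset reported by VIDIOC_QUERYBUF.

```cpp
// Sketch of the mmap/munmap life cycle with anonymous memory (no device).
// A V4L2 capture would instead pass the device fd and buf.m.offset.
#include <sys/mman.h>
#include <cstring>

bool mmap_roundtrip(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;
    std::memset(p, 0xAB, len);                    // pretend this is frame data
    bool ok = ((unsigned char *)p)[len - 1] == 0xAB;
    munmap(p, len);                               // always unmap when done
    return ok;
}
```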
A Program for Capturing Video

The program is based on the capture example published at http://linuxtv.org/downloads/v4l-dvb-apis/capture-example.html (with a few changes to support the C270 motion JPEG stream). The program makes several IOCTL calls, so it's a good idea to have a wrapper function for them:

static int xioctl(int fh, int request, void *arg)
{
        int r;
        do {
                r = ioctl(fh, request, arg);
        } while (-1 == r && EINTR == errno);
        return r;
}

xioctl() retries calls interrupted by signals (EINTR); all the following steps use it to send each IOCTL to the kernel through the ioctl() function.

The first step is to open the device. The device name is the path of the device node under /dev; for example, the string might be "/dev/video0". The O_NONBLOCK option prevents the software from remaining blocked when the buffers are read (this is explained in more detail in the dequeue process in Step 9).

static void open_device(void)
{
        ...
        ...
        ...
        fd = open(dev_name, O_RDWR /* required */ | O_NONBLOCK, 0);
        if (-1 == fd) {
                fprintf(stderr, "Cannot open '%s': %d, %s\n",
                        dev_name, errno, strerror(errno));
                exit(EXIT_FAILURE);
        }
}
Next, the program queries the device capabilities using VIDIOC_QUERYCAP:

struct v4l2_capability cap;
...
...
...
if (-1 == xioctl(fd, VIDIOC_QUERYCAP, &cap)) {
if (EINVAL == errno) {
fprintf(stderr, "%s is no V4L2 device\n",
dev_name);
exit(EXIT_FAILURE);
} else {
errno_exit("VIDIOC_QUERYCAP");
}
}
The next step resets the cropping rectangle with VIDIOC_S_CROP to its default values. If the device does not support cropping, you can ignore the errors because the image will always have the same resolution.

struct v4l2_crop crop;
struct v4l2_cropcap cropcap;
...
...
...
cropcap.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
if (0 == xioctl(fd, VIDIOC_CROPCAP, &cropcap)) {
        crop.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        crop.c = cropcap.defrect; /* reset to default */
        if (-1 == xioctl(fd, VIDIOC_S_CROP, &crop)) {
                switch (errno) {
                case EINVAL:
/* Cropping not supported. */
break;
default:
/* Errors ignored. */
break;
}
}
}
The next step sets the video format. If VIDIOC_S_FMT is used, the device assumes the settings passed to the IOCTL in the v4l2_format structure. Otherwise, if VIDIOC_G_FMT is used, the currently programmed settings are kept; in that case you can adjust them with the v4l2-ctl tool as explained before. This is the only part of the code that changed from the original code on the V4L2 website: the force_format variable, when set to true, forces the format to a motion JPEG stream with 1280x720 resolution. Otherwise, the current settings of your camera are used, which you can change using the v4l2-ctl tool.

fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
if (force_format) {
fmt.fmt.pix.width = 1280;
fmt.fmt.pix.height = 720;
fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_MJPEG;
fmt.fmt.pix.field = V4L2_FIELD_NONE;
        if (-1 == xioctl(fd, VIDIOC_S_FMT, &fmt))
errno_exit("VIDIOC_S_FMT");
/* Note VIDIOC_S_FMT may change width and height. */
} else {
/* Preserve original settings as set by v4l2-ctl for example */
        if (-1 == xioctl(fd, VIDIOC_G_FMT, &fmt))
errno_exit("VIDIOC_G_FMT");
}
The next step requests the frame buffers. Using the v4l2_requestbuffers structure, more precisely the count field, the application asks the device through the IOCTL VIDIOC_REQBUFS for a certain number of buffers to store the images. In the case of webcam C270, the maximum number of buffers is five; ask for more and the webcam reports "out of memory" and VIDIOC_REQBUFS fails. It's best to set at least two buffers. If the device accepts the requested number of buffers, the VIDIOC_REQBUFS call reports the size of the buffers. Then you allocate the buffers in the userspace context to allow the device to fill them; the allocation can be done using regular functions like calloc() and malloc().

struct v4l2_requestbuffers req;
CLEAR(req);
req.count = 5;
req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
req.memory = V4L2_MEMORY_MMAP;
if (-1 == xioctl(fd, VIDIOC_REQBUFS, &req)) {
if (EINVAL == errno) {
fprintf(stderr, "%s does not support "
"memory mapping\n", dev_name);
exit(EXIT_FAILURE);
} else {
errno_exit("VIDIOC_REQBUFS");
}
}
if (req.count < 2) {
fprintf(stderr, "Insufficient buffer memory on %s\n",
dev_name);
exit(EXIT_FAILURE);
}
buffers = calloc(req.count, sizeof(*buffers));
if (!buffers) {
fprintf(stderr, "Out of memory\n");
exit(EXIT_FAILURE);
}
The next step queries each buffer with VIDIOC_QUERYBUF. In response to VIDIOC_QUERYBUF, the offset of the buffer in the device memory and the length of each buffer are reported. With this information, the mmap() function must be called to map the virtual memory that will be shared between the userspace and the device.

for (n_buffers = 0; n_buffers < req.count; ++n_buffers)
{
        struct v4l2_buffer buf;
        CLEAR(buf);
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.index = n_buffers;
        if (-1 == xioctl(fd, VIDIOC_QUERYBUF, &buf))
                errno_exit("VIDIOC_QUERYBUF");
buffers[n_buffers].length = buf.length;
        buffers[n_buffers].start = mmap(NULL /* start anywhere */,
buf.length,
PROT_READ | PROT_WRITE /* required */,
MAP_SHARED /* recommended */,
fd, buf.m.offset);
if (MAP_FAILED == buffers[n_buffers].start)
errno_exit("mmap");
}
In the mmap method, each buffer obtained with VIDIOC_REQBUFS must be exchanged with the driver using VIDIOC_QBUF, which enqueues the buffer:

case IO_METHOD_MMAP:
for (i = 0; i < n_buffers; ++i) {
struct v4l2_buffer buf;
CLEAR(buf);
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
buf.index = i;
                if (-1 == xioctl(fd, VIDIOC_QBUF, &buf))
errno_exit("VIDIOC_QBUF");
}
Then the streaming is started with VIDIOC_STREAMON:

if (-1 == xioctl(fd, VIDIOC_STREAMON, &type))
        errno_exit("VIDIOC_STREAMON");
With VIDIOC_DQBUF, the buffers are dequeued and, on success, the frames can be read. Note that there is no specific order for the buffers, so along with the data, the buffer index is reported. It is necessary to keep waiting for all the buffers to be dequeued, and a while loop can be implemented for this purpose. However, to avoid a blocking operation in the userspace context, the select() function is used. Note that if you try to read a buffer that's not yet available in memory and the device was opened with O_NONBLOCK, the error EAGAIN is returned by the VIDIOC_DQBUF call; otherwise the call remains blocked until the buffer is ready.

static void mainloop(void)
{
unsigned int count;
count = frame_count;
while (count-- > 0) {
for (;;) {
fd_set fds;
struct timeval tv;
int r;
FD_ZERO(&fds);
FD_SET(fd, &fds);
/* Timeout. */
tv.tv_sec = 2;
tv.tv_usec = 0;
r = select(fd + 1, &fds, NULL, NULL, &tv);
if (-1 == r) {
if (EINTR == errno)
continue;
errno_exit("select");
}
if (0 == r) {
fprintf(stderr, "select timeout\n");
exit(EXIT_FAILURE);
}
                        if (read_frame())
                                break;
/* EAGAIN - continue select loop. */
}
}
}
...
...
...
static int read_frame(void)
{
struct v4l2_buffer buf;
unsigned int i;
...
...
...
        if (-1 == xioctl(fd, VIDIOC_DQBUF, &buf)) {
                switch (errno) {
                case EAGAIN:
                        printf("EAGAIN\n");
                        return 0;
                case EIO:
                        printf("EIO\n");
/* Could ignore EIO, see spec. */
/* fall through */
default:
printf("default\n");
errno_exit("VIDIOC_DQBUF");
}
}
}
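The select() pattern used in mainloop() can be exercised without a camera. The sketch below applies the same timeout logic to an ordinary pipe: select() returns a positive value when the descriptor is readable, 0 on timeout, and -1 on error (the capture program retries on EINTR in just the same way).

```cpp
// Sketch: the mainloop() select() pattern, isolated. Returns 1 when fd is
// readable within tmo_sec seconds, 0 on timeout, -1 on error.
#include <sys/select.h>
#include <unistd.h>

int wait_readable(int fd, int tmo_sec)
{
    fd_set fds;
    struct timeval tv;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec = tmo_sec;    // same 2-second style timeout as the program
    tv.tv_usec = 0;
    return select(fd + 1, &fds, NULL, NULL, &tv);
}
```

Waiting this way is what keeps the capture loop from spinning on EAGAIN while the driver fills the next buffer.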
When the capture is done, the streaming is stopped with VIDIOC_STREAMOFF:

type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
if (-1 == xioctl(fd, VIDIOC_STREAMOFF, &type))
        errno_exit("VIDIOC_STREAMOFF");
Finally, the memory mapped with mmap() must be unmapped with the munmap() function, and the memory allocated for the buffer bookkeeping must be freed with the free() function:

static void uninit_device(void)
{
unsigned int i;
switch (io) {
...
...
...
case IO_METHOD_MMAP:
for (i = 0; i < n_buffers; ++i)
                        if (-1 == munmap(buffers[i].start, buffers[i].length))
errno_exit("munmap");
break;
...
...
...
free(buffers);
}
static void close_device(void)
{
        if (-1 == close(fd))
errno_exit("close");
fd = -1;
}
Building and Transferring the Video Capture Program

mcramon@ubuntu:~/$ cd <YOUR BASE TOOLCHAIN PATH>
mcramon@ubuntu:~/xcompiler$ source environment-setup-*
mcramon@ubuntu:~/xcompiler$ ${CC} --version
i586-poky-linux-gcc (GCC) 4.7.2
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

If you see output like this, ${CC} was properly set and you are ready to build the program:

${CC} -O2 -Wall `pkg-config --cflags --libs libv4l2` galileo_video_capture.c -o galileo_video_capture

Note that pkg-config supplies the libv4l2 compiler and linker flags used in the compilation. Then transfer the galileo_video_capture program using your favorite method, as explained in Chapter 5. For example, if you are using an Ethernet cable or a WiFi card in Intel Galileo and your operating system is Linux/Mac OS X, you can use scp; if your computer runs Windows, you can use WinSCP. Here's an example using scp:

root@clanton:~# ifconfig eth0 192.254.1.1 netmask 255.255.255.0 up
mcramon@ubuntu:~/$ scp galileo_video_capture root@192.254.1.1:/home/root/
Running the Program and Capturing Videos

The program accepts an argument that selects the IO method:

- -m: Memory mapped; used for the C270 webcam
- -u: Userspace pointers
- -r: Direct read/write

The format can be forced with -f. If this argument is not set, the current camera settings are used to capture the video, which means you can change them using the v4l2-ctl tool, as explained. When -f is used, the capture is forced to a width of 1280, a height of 720, and the pixel format that supports the Motion JPEG stream (MJPEG). This is the only change from the original code, which supports a different format.

The number of frames to capture is set with -c; you need to use -c <NUMBER OF FRAMES>. The -o argument directs the captured content to an output that can be redirected to a file.

Suppose you want to capture 100 frames to a file named video.mjpeg. You can execute the program with the following arguments in the Intel Galileo terminal shell:

root@clanton:~# ./galileo_video_capture -m -f -c 100 -o > video.mjpeg
....................................................................................................

Now let's capture a video with the current camera settings instead of forcing the format with -f. First, reduce the video resolution:

root@clanton:~# v4l2-ctl --set-fmt-video width=320,height=176,pixelformat=1

Then run the program without the -f option:

root@clanton:~# ./galileo_video_capture -m -c 100 -o > video2.mjpeg
....................................................................................................

If your device is not enumerated as /dev/video0, it is necessary to use the -d </dev/video*> option. For example, suppose your device is enumerated as /dev/video1. Your command line must then be:

root@clanton:~# ./galileo_video_capture -d /dev/video1 -m -c 100 -o > video2.mjpeg
Converting and Playing Videos

To convert and play the captured videos, this chapter uses ffmpeg, for three reasons:

- It's actively maintained by the open source community
- It can run directly on the Intel Galileo SD image or on your computer
- It supports different encoders

Instructions for installing ffmpeg on different operating systems are found at http://www.ffmpeg.org/download.html. If your computer runs Linux, the easiest way is to install one of the static releases available at http://ffmpeg.gusari.org/static/ and then run the following commands:

mcramon@ubuntu:~/$ mkdir ffmpeg;cd ffmpeg
mcramon@ubuntu:~/$ wget http://ffmpeg.gusari.org/static/64bit/ffmpeg.static.64bit.2014-03-02.tar.gz
mcramon@ubuntu:~/$ tar -zxvf ffmpeg.static.64bit.2014-03-02.tar.gz

After extraction, ffmpeg will be available in the same directory where you extracted the files. You can then use ffmpeg to convert the videos to a "playable" format for most systems. For example, to convert the first and second videos captured, you can run the following:

mcramon@ubuntu:~/video_samples$ ffmpeg -f mjpeg -i video.mjpeg -c:v copy video.mp4
ffmpeg version N-63717-g4e3fe65 Copyright (c) 2000-2014 the FFmpeg developers
built on Jun 3 2014 01:10:16 with gcc 4.4.7 (Ubuntu/Linaro 4.4.7-1ubuntu2)
configuration: --disable-yasm --enable-cross-compile --arch=x86 --target-os=linux
libavutil 52. 89.100 / 52. 89.100
libavcodec 55. 66.100 / 55. 66.100
libavformat 55. 42.100 / 55. 42.100
libavdevice 55. 13.101 / 55. 13.101
libavfilter 4. 5.100 / 4. 5.100
libswscale 2. 6.100 / 2. 6.100
libswresample 0. 19.100 / 0. 19.100
Input #0, mjpeg, from 'video.mjpeg':
Duration: N/A, bitrate: N/A
    Stream #0:0: Video: mjpeg, yuvj422p(pc), 1280x720, 25 fps, 25 tbr, 1200k tbn, 25 tbc
Output #0, mp4, to 'video.mp4':
Metadata:
encoder : Lavf55.42.100
Stream #0:0: Video: mjpeg (l[0][0][0] / 0x006C), yuvj422p, 1280x720, q=2-31, 25 fps, 1200k tbn, 1200k tbc
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
frame= 100 fps=0.0 q=-1.0 Lsize= 2871kB time=00:00:03.96 bitrate=5939.9kbits/s
video:2870kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.041544%
mcramon@ubuntu:~/video_samples$ ffmpeg -f mjpeg -i video2.mjpeg -vcodec copy video2.mp4
ffmpeg version N-63717-g4e3fe65 Copyright (c) 2000-2014 the FFmpeg developers
built on Jun 3 2014 01:10:16 with gcc 4.4.7 (Ubuntu/Linaro 4.4.7-1ubuntu2)
configuration: --disable-yasm --enable-cross-compile --arch=x86 --target-os=linux
libavutil 52. 89.100 / 52. 89.100
libavcodec 55. 66.100 / 55. 66.100
libavformat 55. 42.100 / 55. 42.100
libavdevice 55. 13.101 / 55. 13.101
libavfilter 4. 5.100 / 4. 5.100
libswscale 2. 6.100 / 2. 6.100
libswresample 0. 19.100 / 0. 19.100
Input #0, mjpeg, from 'video2.mjpeg':
Duration: N/A, bitrate: N/A
    Stream #0:0: Video: mjpeg, yuvj422p(pc), 320x176, 25 fps, 25 tbr, 1200k tbn, 25 tbc
Output #0, mp4, to 'video2.mp4':
Metadata:
encoder : Lavf55.42.100
Stream #0:0: Video: mjpeg (l[0][0][0] / 0x006C), yuvj422p, 320x176, q=2-31, 25 fps, 1200k tbn, 1200k tbc
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
frame= 100 fps=0.0 q=-1.0 Lsize= 843kB time=00:00:03.96 bitrate=1743.1kbits/s
video:841kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.137993%
In these commands, -f mjpeg indicates that the input file is encoded as Motion JPEG; -i specifies the input file; and -vcodec copy (equivalently, -c:v copy) keeps the same encoding and quality but adds the frames to an MP4 container.

The sample videos are also available in the code/video_samples folder of this chapter, so you can exercise this conversion with or without a webcam.

Comparing the files before and after ffmpeg, you can see that the frames in general kept the same content; only a header was appended to the end of the MP4 file, plus a few bytes at the beginning. Figure 7-6 shows the header added to the end of the original MJPEG file.
A Program to Capture Images

The program is based on the grabber example at http://linuxtv.org/downloads/v4l-dvb-apis/v4l2grab-example.html.

- It is also a command-line program, similar to the software used to capture videos, that runs in the Intel Galileo's terminal shell.
- It only supports the memory-mapped IO method, and it is a simplified version of the software used to capture videos. Remember that the video capture software was created to cover the whole scenario necessary to communicate with different types of devices. Limiting the software to memory-mapped IO devices enables the code to be incredibly simplified, and no IO-method argument is necessary on the command line because the method is hard-coded.
- It accepts different resolutions. Use the -W or --width argument for the image's width and the -H or --height argument for the image's height. If the options are omitted, the default resolution is 1280x720 (width and height, respectively). If a resolution not supported by the webcam is requested through these arguments, the closest supported resolution is automatically selected: libv4l2 compares the request with the ones supported by the camera, and a warning message is displayed in the terminal shell.
- It can select between two encodes, YUYV or RGB24. If the -y or --yuyv argument is used, the YUYV encode is used; otherwise, the RGB24 encode is used by default.
- It is possible to set the number of images that will be stored as files using the -c or --count argument, followed by the number of images desired. If this option is omitted, the number of images stored in the file system is 10 by default. The image names have the prefix out, followed by three decimals indicating the image order, with the extension .ppm. For example, out000.ppm, out001.ppm, and so on.
But wait: didn't v4l2-ctl --list-formats show that the webcam C270 only supports MJPEG and YUYV? Indeed, RGB24 is not supported natively by the webcam C270, but libv4l2 supports conversion from YUYV to RGB24 and BGR24. In other words, even when you set the pixel format to RGB24 or BGR24 (more precisely, V4L2_PIX_FMT_RGB24 and V4L2_PIX_FMT_BGR24), if your camera does not offer such formats natively, YUYV is used and the V4L2 library makes the conversion from YUYV to RGB24 or BGR24 for you. In the code, the encode is selected by the variable isYUYV:

CLEAR(fmt);
fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
fmt.fmt.pix.width = width;
fmt.fmt.pix.height = height;
if (!isYUYV)
{
printf("Encode RGB24\n");
fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_RGB24;
}
else
{
printf("Encode YUYV\n");
fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;
}
fmt.fmt.pix.field = V4L2_FIELD_INTERLACED;
xioctl(fd, VIDIOC_S_FMT, &fmt);
if (fmt.fmt.pix.pixelformat != V4L2_PIX_FMT_RGB24 &&
fmt.fmt.pix.pixelformat != V4L2_PIX_FMT_YUYV) {
printf("Libv4l didn't accept RGB24 or YUYV format. Can't proceed.\n");
exit(EXIT_FAILURE);
}
if ((fmt.fmt.pix.width != width) || (fmt.fmt.pix.height != height))
printf("Warning: driver is sending image at %dx%d\n",
fmt.fmt.pix.width, fmt.fmt.pix.height);
The file starts with a header. P6 is called the "magic identifier" (it can be P3 as well, for the ASCII variant). The header also contains the image width and height, represented by 1280 and 720 in this example, and the maximum color value. Open the output file with fopen() and write this string sequence with fprintf():

fout = fopen(out_name, "w");
...
...
...
fprintf(fout, "P6\n%d %d 255\n", fmt.fmt.pix.width, fmt.fmt.pix.height);
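As a self-contained illustration of the same layout, the hypothetical helper below composes a complete P6 file in memory: the header written by the fprintf() call above, followed by width*height*3 bytes of raw RGB data.

```cpp
// Sketch: build a P6 (binary PPM) file in memory, using the same header
// layout as the grabber ("P6\n<width> <height> 255\n" + raw RGB bytes).
// Illustrative helper, not part of the chapter's code.
#include <cstdio>
#include <string>

std::string make_ppm(int w, int h, const std::string &rgb)
{
    char hdr[64];
    int n = std::snprintf(hdr, sizeof(hdr), "P6\n%d %d 255\n", w, h);
    return std::string(hdr, n) + rgb;   // header + w*h*3 pixel bytes
}
```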
After the header is written with the fprintf() function, the program captures the frames in a loop, dequeuing each buffer with VIDIOC_DQBUF:

xioctl(fd, VIDIOC_STREAMON, &type);
for (i = 0; i < images_count; i++) {
do {
...
...
...
CLEAR(buf);
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
xioctl(fd, VIDIOC_DQBUF, &buf);
sprintf(out_name, "out%03d.ppm", i);
printf("Creating image: %s\n", out_name);
        fout = fopen(out_name, "w");
if (!fout) {
perror("Cannot open image");
exit(EXIT_FAILURE);
}
fprintf(fout, "P6\n%d %d 255\n", fmt.fmt.pix.width, fmt.fmt.pix.height);
...
...
...
fwrite(buffers[buf.index].start, buf.bytesused, 1, fout);
...
...
...
fclose(fout);
xioctl(fd, VIDIOC_QBUF, &buf);
}
// each pixel 3 bytes in RGB 24
int size = fmt.fmt.pix.width * fmt.fmt.pix.height * sizeof(char) * 3;
unsigned char * data = (unsigned char *) malloc(size);

When the YUYV encode is used, the frames must be converted to RGB24 before being written to the P6 file. The conversion function is called yuyv_to_rgb24() and it comes from the file cvcap_v4l.cpp; for reference, you can see the whole file at https://code.ros.org/trac/opencv/browser/trunk/opencv/src/highgui/cvcap_v4l.cpp?rev=284. It receives the image dimensions (fmt.fmt.pix.width and fmt.fmt.pix.height), the initial buffer address of the current frame (buffers[buf.index].start), and the destination buffer allocated (data):

yuyv_to_rgb24(fmt.fmt.pix.width,
              fmt.fmt.pix.height,
              (unsigned char*)(buffers[buf.index].start),
              data);
fwrite(data, size, 1, fout);
free (data);
...
...
...
fclose(fout);
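For reference, the math behind a YUYV-to-RGB conversion is the standard ITU-R BT.601 transform applied per pixel. The sketch below is illustrative only: it is not the cvcap_v4l.cpp implementation (which uses fixed-point arithmetic), but it shows the same idea with approximate floating-point coefficients.

```cpp
// Sketch: per-pixel YUV -> RGB using approximate BT.601 coefficients.
// Illustrative only; production code typically uses fixed-point tables.
#include <cstdint>

static uint8_t clamp8(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

void yuv_to_rgb(int y, int u, int v, uint8_t rgb[3])
{
    int d = u - 128;                                   // chroma offsets
    int e = v - 128;
    rgb[0] = clamp8(y + (int)(1.402 * e));                       // R
    rgb[1] = clamp8(y - (int)(0.344 * d) - (int)(0.714 * e));    // G
    rgb[2] = clamp8(y + (int)(1.772 * d));                       // B
}
```

In YUYV, each pair of pixels shares one U and one V sample, so a full-frame converter walks the buffer four bytes (two pixels) at a time and applies this transform twice.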
Building and Transferring the Picture Grabber

With the toolchain environment still set, build the program the same way:

${CC} -O2 -Wall `pkg-config --cflags --libs libv4l2` picture_grabber.c -o picture_grabber
Running the Program and Capturing Images

To capture five images at 352x288, run:

root@clanton:~# ./picture_grabber -W 352 -H 288 -c 5
Encode RGB24
Creating image: out000.ppm
Creating image: out001.ppm
Creating image: out002.ppm
Creating image: out003.ppm
Creating image: out004.ppm

Five images with the prefix out and the extension ppm are created. Copy these images to your computer and open them using an image viewer. If you request a resolution the webcam does not support, the closest one is selected and a warning is printed:

root@clanton:~# ./picture_grabber -W 300 -H 200 -c 5
Encode RGB24
Warning: driver is sending image at 176x144
Creating image: out000.ppm
Creating image: out001.ppm
Creating image: out002.ppm
Creating image: out003.ppm
Creating image: out004.ppm
To capture using the YUYV encode instead, add the -y or --yuyv argument:

root@clanton:~# ./picture_grabber -W 352 -H 288 -c 5 -y
Encode YUYV
Creating image: out000.ppm
Creating image: out001.ppm
Creating image: out002.ppm
Creating image: out003.ppm
Creating image: out004.ppm
Working with OpenCV
Building Programs with OpenCV
opencv_capimage.cpp
, use the following line:${CXX} -O2 `pkg-config --cflags --libs opencv` opencv_capimage.cpp -o opencv_capimage
${CXX}
invokes the C++ compiler (g++) of the toolchain and pkg-config
invokes the opencv
libs.Capturing an Image with OpenCV
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main()
{
VideoCapture cap(-1);
//check if the file was opened properly
if(!cap.isOpened())
{
cout << "Webcam could not be opened successfully" << endl;
exit(-1);
}
else
{
cout << "Webcam is OK! I found it!" << endl;
}
int w = 960;
int h = 544;
cap.set(CV_CAP_PROP_FRAME_WIDTH, w);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, h);
Mat frame;
cap >>frame;
imwrite("opencv.jpg", frame);
cap.release();
return 0;
}
Reviewing opencv_capimage.cpp
VideoCapture
is used to create the capture objects that open and configure the devices, capture images and videos, and release the devices when they are not in use any more. Mat
receives the frames that are read and applies algorithms to process the images. It can apply filters, change colors, and transform the images according to mathematical and statistical algorithms. In the next example, Mat
is used only to read the image. However, in the next couple of examples, Mat
will be used to process images as well.VideoCapture::VideoCapture
VideoCapture
class to create a video capture object and open the device or some video stored in the file system.
class, see
http://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.html
.-1
in the constructor, as follows:VideoCapture cap(-1);
-1
means, “open the current device enumerated in the system,” so if you have the camera enumerated as /dev/video0
or /dev/video1
, the webcam will be opened anyway. Otherwise, if you want to be specific regarding which device to open, you have to pass to the constructor the index of the enumerated device. For example, to open the device /dev/video0
, you must pass the number 0
to the constructor like this:VideoCapture cap(0);
-1
to avoid problems with camera enumeration indexes versus the hardcoded number you use in the constructor.VideoCapture::isOpened()
isOpened()
method. It returns a Boolean as true
if the webcam was opened and false
if not.VideoCapture::set(const int prop, int value)
prop
) to a specific value (value
). You can set the image’s width, height, frames per second, and several other properties. In the code example, the video width and height are set to 960x544:int w = 960;
int h = 544;
cap.set(CV_CAP_PROP_FRAME_WIDTH, w);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, h);
http://nullege.com/codes/search/opencv.highgui.CV_CAP_PROP_FPS
.VideoCapture::read(Mat & image) or operator >> (Mat & image)
Mat
object that is explained shortly.>>
:Mat frame;
cap >>frame;
VideoCapture::release( )
release()
method.cap.release();
cv::Mat::Mat
Mat
is an awesome class used for matrix operations and it is constantly used in OpenCV applications. Mat
is used to organize images in the format of matrixes responsible for saving details of each pixel, including color intensity, position in the image, image dimension, and so on.Mat
class is organized into two parts—one part contains the image headers with generic information about the image and the second part contains the sequence of bytes representing the image.Mat
is called only as Mat
instead of as cv::Mat
because the namespace was defined in the beginning of the code:using namespace cv;
Mat
object created with the simple constructors available in the class:Mat frame;
Mat
class is for and this simple constructor.Mat
class, visit
http://docs.opencv.org/modules/core/doc/basic_structures.html#mat-mat
. The tutorial maintained by docs.opencv.org
is also recommended at
http://docs.opencv.org/doc/tutorials/core/mat_the_basic_image_container/mat_the_basic_image_container.html
.cv::imwrite( const string& filename, InputArray img, const vector<int>& params=vector<int>())
opencv.jpg
, the input array is implicitly converted from the Mat
object frame, and the optional vector of the params
argument is omitted.Mat frame;
cap >>frame;
imwrite("opencv.jpg", frame);
params
argument, the encoding used to save the image is determined by the file extension .jpg
. Remember that the camera does not capture images in the JPEG format. It streams in Motion JPEG, but individual JPEG frames cannot simply be extracted from the Motion JPEG stream because a segment called DHT (Define Huffman Table) is not present in it (check out
http://www.digitalpreservation.gov/formats/fdd/fdd000063.shtml
). You can extract a series of JPEG images using ffmpeg
from a Motion JPEG stream, but they will not be viewable in any image software due to the missing DHT segment.docs.opencv.org
site maintains a nice tutorial about how to load, modify, and save an image at
http://docs.opencv.org/doc/tutorials/introduction/load_save_image/load_save_image.html
.Running opencv_capimage.cpp
uvcvideo
driver is loaded and the webcam is connected to the USB port (read the section called “Connecting the Webcam” in this chapter). Finally, smile at your webcam and run the software:root@clanton:∼#
./opencv_capimage
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
Webcam is OK! I found it!
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
VIDIOC_QUERYMENU: Invalid argument
opencv.jpg
in the same folder. Now, you might be asking what the VIDIOC_QUERYMENU: Invalid argument
messages mean. Such messages are not related to OpenCV and there is nothing wrong with the code. OpenCV is simply using the Video4Linux framework to query the capabilities and controls offered by the webcam. When some control or capability is not offered, V4L informs you with these warning messages.stderr
stream to a null device. For example:root@clanton:∼#
./opencv_capimage 2> /dev/null
Webcam is OK! I found it!
The Same Software Written in Python
import cv2
import cv
import sys
cap = cv2.VideoCapture(-1)
w, h =
960, 544
cap.set(cv.CV_CAP_PROP_FRAME_WIDTH, w)
cap.set(cv.CV_CAP_PROP_FRAME_HEIGHT, h)
if not cap.isOpened():
print "Webcam could not be opened successfully"
sys.exit(-1)
else:
print "Webcam is OK! I found it!"
ret, frame = cap.read()
cv2.imwrite('pythontest.jpg', frame)
cap.release()
root@clanton:∼#
python opencv_capimage.py 2> /dev/null
Webcam is OK! I found it!
Performance of OpenCV C++ versus OpenCV Python
date +%s
, which returns the number of seconds passed since 00:00:00 1970-01-01 UTC. Execute the program and evaluate the time difference.root@clanton:∼#
s=$(date +%s);python opencv_capimage.py; echo $(expr `date +%s` - $s)
Webcam is OK! I found it!
8
root@clanton:∼#
s=$(date +%s);./opencv_capimage 2> /dev/null; echo $(expr `date +%s` - $s)
Webcam is OK! I found it!
4
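The same measurement can also be done from inside Python with time.time() instead of wrapping the command in date +%s. This is a small sketch (the timed() helper is not part of the book's code): time.time() gives sub-second resolution, while date +%s only counts whole seconds.

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.time()
    result = fn(*args)
    return result, time.time() - start

# time a pure-Python workload as a stand-in for the capture program
result, elapsed = timed(sum, range(1000000))
print("took %.3f s" % elapsed)
```

The same wrapper could time the OpenCV capture code by placing the VideoCapture calls inside the function passed to timed().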
Processing Images
Detecting Edges
Canny()
that implements such an algorithm. For details about this algorithm, see
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html
.#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main()
{
VideoCapture cap(-1);
//check if the file was opened properly
if(!cap.isOpened())
{
cout << "Webcam could not be opened successfully" << endl;
exit(-1);
}
else
{
cout << "Webcam is OK! I found it!\n" << endl;
}
int w = 960;
int h = 544;
cap.set(CV_CAP_PROP_FRAME_WIDTH, w);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, h);
Mat frame;
cap >>frame;
// converts the image to grayscale
Mat frame_in_gray;
cvtColor(frame, frame_in_gray, CV_BGR2GRAY);
// process the Canny algorithm
cout << "processing image with Canny..." << endl;
int threshold1 = 0;
int threshold2 = 28;
Canny(frame_in_gray, frame_in_gray, threshold1, threshold2);
// saving the images in the files system
cout << "Saving the images..." << endl;
imwrite("captured.jpg", frame);
imwrite("captured_with_edges.jpg", frame_in_gray);
// release the camera
cap.release();
return 0;
}
Reviewing opencv_capimage_canny.cpp
cvtColor()
is added.Canny()
function is used for image processing.captured.jpg
and captured_with_edges.jpg
using the imwrite()
function explained previously.void cv::cvtColor(InputArray src, OutputArray dst, int code, int dstCn=0)
Mat frame_in_gray;
cvtColor(frame, frame_in_gray, CV_BGR2GRAY);
Mat
object frame. The frame_in_gray
object was created to receive the image converted in gray space color as requested by code CV_BGR2GRAY
.cvtColor()
function and color in general, visit
http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#cvtcolor
.void cv::Canny(InputArray image, OutputArray edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false)
edges
. The input and output images in the example are the same object (frame_in_gray);
for best effect, a grayscale image is used.apertureSize
argument is the size of the Sobel operator used in the algorithm (see
http://en.wikipedia.org/wiki/Sobel_operator
for more details) and the code keeps the default value of 3.L2gradient
argument is a Boolean; when it’s true
, the image gradient magnitude
is used and when it’s false
, only the normative equation is considered. This example used the default value of false
.threshold1
and threshold2
and the values 0 and 28 were used, respectively. These values are based on my experiments, changing them until I got results I considered good. You can change these values and check the effects you get.int threshold1 = 0;
int threshold2 = 28;
Canny(frame_in_gray, frame_in_gray, threshold1, threshold2);
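The role of the two thresholds can be illustrated with a tiny 1-D sketch of the hysteresis step. This is a hypothetical simplification of what Canny() does internally on 2-D gradient maps, not OpenCV's code: magnitudes at or above threshold2 are definite edges, magnitudes below threshold1 are discarded, and in-between magnitudes are kept only when connected to a definite edge.

```python
def hysteresis_1d(gradients, threshold1, threshold2):
    """Classify gradient magnitudes: keep strong edges and
    weak edges connected to a strong neighbor."""
    strong = [g >= threshold2 for g in gradients]
    keep = strong[:]
    changed = True
    while changed:  # propagate along chains of connected weak edges
        changed = False
        for i, g in enumerate(gradients):
            if keep[i] or g < threshold1:
                continue  # already kept, or too weak to consider
            if (i > 0 and keep[i - 1]) or (i + 1 < len(gradients) and keep[i + 1]):
                keep[i] = True
                changed = True
    return keep

edges = hysteresis_1d([5, 10, 40, 15, 2, 20], threshold1=8, threshold2=30)
```

Here only the value 40 clears threshold2, but the neighboring 10 and 15 survive by connectivity, while the isolated 20 is dropped; this is why a low threshold1 (such as 0 in the example) keeps edge chains continuous.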
Running opencv_capimage_canny.cpp
uvcvideo
driver is loaded and the webcam is connected to the USB port (read the section entitled “Connecting the Webcam” in this chapter). Point your webcam to some object rich in edges, like the image shown in Figures 7-8 and 7-9.root@clanton:∼#
./opencv_capimage_canny 2> /dev/null
Webcam is OK! I found it!
processing image with Canny...
Saving the images...
Face and Eyes Detection
CascadeClassifier
.haarcascade_frontalface_alt.xml
and haarcascade_eye.xml
—during the creation of the CascadeClassifier
objects. Each file brings a series of models that defines how specific objects are represented in the image based on the sum of intensity of pixels in a series of rectangles. The difference of these sums is evaluated in the image. Both files have characteristics about faces and eyes from an image and the class CascadeClassifier
can determine the detections when the method detectMultiScale()
is invoked.CascadeClassifier()
, visit
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html?highlight=cascadeclassifier#cascadeclassifier
.rectangle()
and circle()
.#include <opencv2/opencv.hpp>
#include "opencv2/core/core.hpp"
using namespace cv;
using namespace std;
String face_cascade_name = "haarcascade_frontalface_alt.xml";
String eye_cascade_name = "haarcascade_eye.xml";
void faceDetect(Mat img);
CascadeClassifier face_cascade;
CascadeClassifier eyes_cascade;
using namespace cv;
using namespace std;
int main(int argc, const char *argv[])
{
if( !face_cascade.load( face_cascade_name ) )
{
cout << face_cascade_name << " not found!! aborting..." << endl;
exit(-1);
};
if( !eyes_cascade.load( eye_cascade_name ) )
{
cout << eye_cascade_name << " not found!! aborting..." << endl;
exit(-1);
};
// 0 is the ID of the built-in laptop camera, change if you want to use other camera
VideoCapture cap(-1);
//check if the file was opened properly
if(!cap.isOpened())
{
cout << "Capture could not be opened succesfully" << endl;
return -1;
}
else
{
cout << "camera is ok\n" << endl;
}
int w = 432;
int h = 240;
cap.set(CV_CAP_PROP_FRAME_WIDTH, w);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, h);
Mat frame;
cap >>frame;
cout << "processing the image...." << endl;
faceDetect(frame);
imwrite("face_and_eyes.jpg", frame);
// release the camera
cap.release();
cout << "done!" << endl;
return 0;
}
void faceDetect(Mat img)
{
std::vector<Rect> faces;
std::vector<Rect> eyes;
bool two_eyes = false;
bool any_eye_detected = false;
//detecting faces
face_cascade.detectMultiScale( img, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
if (faces.size() == 0)
{
cout << "Try again... I did not detect any faces..." << endl;
return;
}
// it is possible to face more than one human face in the image
for( size_t i = 0; i < faces.size(); i++ )
{
// rectangle in the face
rectangle( img, faces[i], Scalar( 255, 100, 0 ), 4, 8, 0 );
Mat frame_gray;
cvtColor( img, frame_gray, CV_BGR2GRAY );
// cropping only the face region defined by faces[i]
std::vector<Rect> eyes;
Mat faceROI = frame_gray( faces[i] );
// In each face, detect eyes
eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );
for( size_t j = 0; j < eyes.size(); j++ )
{
Point center( faces[i].x + eyes[j].x + eyes[j].width*0.5, faces[i].y + eyes[j].y + eyes[j].height*0.5 );
int radius = cvRound( (eyes[j].width + eyes[j].height)*0.25 );
circle( img, center, radius, Scalar( 255, 0, 0 ), 4, 8, 0 );
}
}
}
Reviewing opencv_face_and_eyes_detection.cpp
- Introduction of the CascadeClassifier class
- Usage of the Point class
- The cvRound() function
- Usage of the rectangle() and circle() functions
- The Rect class and vectors
cv::CascadeClassifier::CascadeClassifier( )
CascadeClassifier
object. In the example code, two objects are created, one to detect the face and the other to detect the eyes.CascadeClassifier face_cascade;
CascadeClassifier eyes_cascade;
cv::CascadeClassifier::load(const string & filename)
if( !face_cascade.load( face_cascade_name ) )
{
cout << face_cascade_name << " not found!! aborting..." << endl;
exit(-1);
};
if( !eyes_cascade.load( eye_cascade_name ) )
{
cout << eye_cascade_name << " not found!! aborting..." << endl;
exit(-1);
};
void cv::CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size())
detectMultiScale()
method is where the magic happens in terms of detections. A description of each argument follows:
-
image
is the image source. -
vector<Rect>& objects
is a vector of rectangles and is where the detected objects in the image are stored. -
scaleFactor
is a factor that determines how much the image size is reduced at each image scale. -
minNeighbors
determines how many neighbors each candidate rectangle has. If0
is passed, there is a risk of other objects in the image being detected incorrectly, which results in false positives in the detection. For example, if you have a clock on your wall it might be detected as a face (a false positive). During my practical experiments, specifying 2 or 3 is good. More than 3 and there is a risk of losing true positives and faces not being detected properly. -
flags
is related to the type of optimization. CV_HAAR_SCALE_IMAGE
tells the algorithm to take charge of the image scaling. This flag also accepts CV_HAAR_DO_CANNY_PRUNING
, which skips flat regions; CV_HAAR_FIND_BIGGEST_OBJECT
, if there is interest in finding only the biggest object in the image; and CV_HAAR_DO_ROUGH_SEARCH
, which must be used only with CV_HAAR_FIND_BIGGEST_OBJECT, like "0|CV_HAAR_DO_ROUGH_SEARCH|CV_HAAR_FIND_BIGGEST_OBJECT"
. -
minSize
defines the minimum object size; objects smaller than this are ignored. If it’s not defined, this argument is not considered. -
maxSize
defines the maximum object size; objects bigger than this are ignored. If it’s not defined, this argument is not considered.
//detecting faces
face_cascade.detectMultiScale( img, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
...
...
...
//In each face, detect eyes
eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30)
minNeighbors
is 2 (a kind of hint), the flags are optimized for performance using CV_HAAR_SCALE_IMAGE
, and the minimum size of the object to detect is 30x30 pixels. No maximum size is defined, so you can put your face very close to the webcam.// rectangle in the face
rectangle( img, faces[i], Scalar( 255, 100, 0 ), 4, 8, 0 );
vector<Rect> faces
. For example, faces[0]
is the first face in the picture. If there is more than one person, you will have faces[1]
, faces[2]
, and so on. The object type Rect
means rectangle, so the faces vector is a group of rectangles, not graphical objects. They are objects that store the initial coordinates (upper-left point) in (Rect.x, Rect.y)
and the width (Rect.width)
and height (Rect.height)
of the rectangle in the Rect
class.cvtColor()
function.Mat frame_gray;
cvtColor( img, frame_gray, CV_BGR2GRAY );
// cropping only the face region defined by faces[i]
std::vector<Rect> eyes;
Mat faceROI = frame_gray( faces[i] );
// In each face, detect eyes
eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );
vector<Rect> eyes
.for
loops in this code:for( size_t i = 0; i < faces.size(); i++ )
{
...
...
...
for( size_t j = 0; j < eyes.size(); j++ )
{
...
...
... }
}
Point
class was used. It extracts information from vector<Rect> eyes
and stores the exact center of the eyes (the central coordinates):Point center
( faces[i].x + eyes[j].x + eyes[j].width*0.5, faces[i].y + eyes[j].y + eyes[j].height*0.5 );
int radius =
cvRound
( (eyes[j].width + eyes[j].height)*0.25 );
circle( img,
center
,
radius
, Scalar( 255, 0, 0 ), 4, 8, 0 );
Point center
object is based on the rectangle’s dimension of the current face, identified by the center point of the eye and the variable radius
. Using the function cvRound()
, it determines the radius to be drawn around the eyes.circle()
.Running opencv_face_and_eyes_detection.cpp
uvcvideo
driver was loaded and the webcam is connected to the USB port (read the section called “Connecting the Webcam” in this chapter) and copy the haarcascade_frontalface_alt.xml
and haarcascade_eye.xml
files to the same location you transferred the executable program. Stay in front of camera and look in the direction of the lens. Then run the software:root@clanton:∼#
./opencv_face_and_eyes_detection 2> /dev/null
camera is ok
processing the image....
done!
face_and_eyes.jpg
is created in the file system with all faces and eyes detected, as shown in Figure 7-11.
Emotions Classification
http://docs.opencv.org/trunk/modules/contrib/doc/facerec/tutorial/facerec_gender_classification.html
. Phillip Wagner kindly granted permission for the code adaptation and the techniques explored, keeping all the code under the BSD licenses as his original work in this book.-
Run on Intel Galileo and classify emotions instead of genders.
-
Use faces and eyes detection directly from the images captured by the webcam.
-
Crop the images dynamically based on human anatomy.
-
Happy
-
Sad
-
Curious
fisherface
.Preparing the Desktop
pillow
and setuptools
modules installed.Pillow
is used to manipulate images using Python scripts and setuptools
is a dependency that Pillow requires. You should install the setuptools
module first.Pillow
can be downloaded from
https://pypi.python.org/pypi/Pillow
and the setuptools
module can be downloaded from
https://pypi.python.org/pypi/setuptools
. Both sites include information on how to install these modules on Linux, Windows, and MacOSX.http://www.gimp.org/
.Creating the Database
Obtaining the Initial Images
initial_pictures
subfolder of the code
folder of this chapter, there are some pictures of me of each emotion. For each picture the pixel coordinates of the center of my eyes were taken—see Table 7-2.Picture | Left Eye Center (x,y) | Right Eye Center (x,y) |
---|---|---|
serious_01.jpg | 528, 423 | 770, 431 |
serious_02.jpg | 522, 412 | 758, 415 |
serious_03.jpg | 518, 423 | 754, 425 |
smile_01.jpg | 516, 377 | 753, 379 |
smile_02.jpg | 533, 374 | 763, 380 |
smile_03.jpg | 518, 379 | 749, 381 |
surprised_01.jpg | 516, 356 | 754, 355 |
surprised_02.jpg | 548, 364 | 793, 364 |
surprised_03.jpg | 528, 377 | 770, 378 |
Cropping the Images
#!/usr/bin/env python
# Software License Agreement (BSD License)
#
# Copyright (c) 2012, Philipp Wagner
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following
# disclaimer in the documentation and/or other materials provided
# with the distribution.
# * Neither the name of the author nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
# Manoel Ramon 06/11/2014- changed the code to support images used
# as example of emotion classification
#
import sys, math, Image
def Distance(p1,p2):
dx = p2[0] - p1[0]
dy = p2[1] - p1[1]
return math.sqrt(dx*dx+dy*dy)
def ScaleRotateTranslate(image, angle, center = None, new_center = None, scale = None, resample=Image.BICUBIC):
if (scale is None) and (center is None):
return image.rotate(angle=angle, resample=resample)
nx,ny = x,y = center
sx=sy=1.0
if new_center:
(nx,ny) = new_center
if scale:
(sx,sy) = (scale, scale)
cosine = math.cos(angle)
sine = math.sin(angle)
a = cosine/sx
b = sine/sx
c = x-nx*a-ny*b
d = -sine/sy
e = cosine/sy
f = y-nx*d-ny*e
return image.transform(image.size, Image.AFFINE, (a,b,c,d,e,f), resample=resample)
def CropFace(image, eye_left=(0,0), eye_right=(0,0), offset_pct=(0.2,0.2), dest_sz = (70,70)):
# calculate offsets in original image
offset_h = math.floor(float(offset_pct[0])*dest_sz[0])
offset_v = math.floor(float(offset_pct[1])*dest_sz[1])
# get the direction
eye_direction = (eye_right[0] - eye_left[0], eye_right[1] - eye_left[1])
# calc rotation angle in radians
rotation = -math.atan2(float(eye_direction[1]),float(eye_direction[0]))
# distance between them
dist = Distance(eye_left, eye_right)
# calculate the reference eye-width
reference = dest_sz[0] - 2.0*offset_h
# scale factor
scale = float(dist)/float(reference)
# rotate original around the left eye
image = ScaleRotateTranslate(image, center=eye_left, angle=rotation)
# crop the rotated image
crop_xy = (eye_left[0] - scale*offset_h, eye_left[1] - scale*offset_v)
crop_size = (dest_sz[0]*scale, dest_sz[1]*scale)
image = image.crop((int(crop_xy[0]), int(crop_xy[1]), int(crop_xy[0]+crop_size[0]), int(crop_xy[1]+crop_size[1])))
# resize it
image = image.resize(dest_sz, Image.ANTIALIAS)
return image
if __name__ == "__main__":
#Serious_01.jpg
#left -> 528, 423
#right -> 770, 431
image = Image.open("serious_01.jpg")
CropFace(image, eye_left=(528,423), eye_right=(770,431), offset_pct=(0.2,0.2)).save("serious01_20_20_70_70.jpg")
#Serious_02.jpg
#left -> 522,412
#right -> 758, 415
image = Image.open("serious_02.jpg")
CropFace(image, eye_left=(522,412), eye_right=(758,415), offset_pct=(0.2,0.2)).save("serious02_20_20_70_70.jpg")
#Serious_03.jpg
#left -> 518, 423
#right -> 754, 425
image = Image.open("serious_03.jpg")
CropFace(image, eye_left=(518,423), eye_right=(754,425), offset_pct=(0.2,0.2)).save("serious03_20_20_70_70.jpg")
#Smile_01.jpg
#left -> 516, 377
#right -> 753, 379
image = Image.open("smile_01.jpg")
CropFace(image, eye_left=(516,377), eye_right=(753,379), offset_pct=(0.2,0.2)).save("smile01_20_20_70_70.jpg")
#Smile_02.jpg
#left -> 533, 374
#right -> 763, 380
image = Image.open("smile_02.jpg")
CropFace(image, eye_left=(533,374), eye_right=(763,380), offset_pct=(0.2,0.2)).save("smile02_20_20_70_70.jpg")
#Smile_03.jpg
#left -> 518, 379
#right -> 749, 381
image = Image.open("smile_03.jpg")
CropFace(image, eye_left=(518,379), eye_right=(749,381), offset_pct=(0.2,0.2)).save("smile03_20_20_70_70.jpg")
#surprised_01.jpg
#left -> 516,356
#right -> 754,355
image = Image.open("surprised_01.jpg")
CropFace(image, eye_left=(516,356), eye_right=(754,355), offset_pct=(0.2,0.2)).save("surprised01_20_20_70_70.jpg")
#surprised_02.jpg
#left -> 548, 364
#right -> 793, 364
image = Image.open("surprised_02.jpg")
CropFace(image, eye_left=(548,364), eye_right=(793,364), offset_pct=(0.2,0.2)).save("surprised02_20_20_70_70.jpg")
#surprised_03.jpg
#left -> 528, 377
#right -> 770, 378
image = Image.open("surprised_03.jpg")
CropFace(image, eye_left=(528,377), eye_right=(770,378), offset_pct=(0.2,0.2)).save("surprised03_20_20_70_70.jpg")
mcramon@ubuntu:∼/tmp/opencv/emotion/mypics$
python align_faces.py
_20_20_70_70
is created:mcramon@ubuntu:∼/tmp/opencv/emotion/mypics$
ls *20*
serious01_20_20_70_70.jpg smile01_20_20_70_70.jpg surprised01_20_20_70_70.jpg
serious02_20_20_70_70.jpg smile02_20_20_70_70.jpg surprised02_20_20_70_70.jpg
serious03_20_20_70_70.jpg smile03_20_20_70_70.jpg surprised03_20_20_70_70.jpg
pillow
module to create an image object that, using the CropFace()
function, crops the image according to the given offsets and destination size. For example, to crop the image file surprised_02.jpg
with eye offsets of 20% x 20%, the following lines of code are necessary:image = Image.open("surprised_02.jpg")
CropFace(image, eye_left=(548,364), eye_right=(793,364), offset_pct=(0.2,0.2)).save("surprised02_20_20_70_70.jpg")
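The geometry inside CropFace() can be verified independently of Pillow. This sketch (the helper name crop_box is hypothetical) reproduces the same offset/scale arithmetic for the surprised_02.jpg eye coordinates, which are level and 245 pixels apart:

```python
import math

def crop_box(eye_left, eye_right, offset_pct=(0.2, 0.2), dest_sz=(70, 70)):
    """Return (crop_xy, crop_size, rotation) as CropFace() computes them."""
    offset_h = math.floor(float(offset_pct[0]) * dest_sz[0])
    offset_v = math.floor(float(offset_pct[1]) * dest_sz[1])
    dx = eye_right[0] - eye_left[0]
    dy = eye_right[1] - eye_left[1]
    rotation = -math.atan2(float(dy), float(dx))  # angle needed to level the eyes
    dist = math.sqrt(dx * dx + dy * dy)           # pixel distance between the eyes
    reference = dest_sz[0] - 2.0 * offset_h       # eye distance in the output image
    scale = dist / reference
    crop_xy = (eye_left[0] - scale * offset_h, eye_left[1] - scale * offset_v)
    crop_size = (dest_sz[0] * scale, dest_sz[1] * scale)
    return crop_xy, crop_size, rotation

xy, size, rot = crop_box((548, 364), (793, 364))
```

With a 70x70 destination and 20% offsets, the eyes end up 42 pixels apart in the output, so the source image is cropped to a 408x408 square before being resized down, and no rotation is needed because the eyes are already level.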
scp
. Run the following in the command line in the directory containing your images:mcramon@ubuntu:∼/tmp/opencv/emotion/mypics$
for i in $(ls *20*);do scp $i root@192.254.1.1:/home/root/. ;done
/home/root
directory.Organizing the Images in Directories
mkdir
command to create the serious
,
smile
, and surprised
directories. Move each picture with the mv
command to the corresponding directory. The result is something like this:.
├──
serious
│ ├──
serious01_20_20_70_70.jpg
│ ├──
serious02_20_20_70_70.jpg
│ └──
serious03_20_20_70_70.jpg
├──
smile
│ ├──
smile01_20_20_70_70.jpg
│ ├──
smile02_20_20_70_70.jpg
│ └──
smile03_20_20_70_70.jpg
└──
surprised
├──
surprised01_20_20_70_70.jpg
├──
surprised02_20_20_70_70.jpg
└──
surprised03_20_20_70_70.
jpg
Creating the CSV File
/home/root/emotion/pics/smile/smile01_20_20_70_70.jpg;0
/home/root/emotion/pics/smile/smile02_20_20_70_70.jpg;0
/home/root/emotion/pics/smile/smile03_20_20_70_70.jpg;0
/home/root/emotion/pics/surprised/surprised01_20_20_70_70.jpg;1
/home/root/emotion/pics/surprised/surprised02_20_20_70_70.jpg;1
/home/root/emotion/pics/surprised/surprised03_20_20_70_70.jpg;1
/home/root/emotion/pics/serious/serious01_20_20_70_70.jpg;2
/home/root/emotion/pics/serious/serious02_20_20_70_70.jpg;2
/home/root/emotion/pics/serious/serious03_20_20_70_70.jpg;2
;
with an index that represents the emotional state of the picture. In Listing 7-6, 0
represents smiling, 1
represents surprise, and 2
represents seriousness.#!/usr/bin/env python
# Software License Agreement (BSD License)
#
# Copyright (c) 2012, Philipp Wagner
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following
# disclaimer in the documentation and/or other materials provided
# with the distribution.
# * Neither the name of the author nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
import sys
import os.path
# This is a tiny script to help you creating a CSV file from a face
# database with a similar hierarchie:
#
# philipp@mango:∼/facerec/data/at$ tree
# .
# |-- README
# |-- s1
# | |-- 1.pgm
# | |-- ...
# | |-- 10.pgm
# |-- s2
# | |-- 1.pgm
# | |-- ...
# | |-- 10.pgm
# ...
# |-- s40
# | |-- 1.pgm
# | |-- ...
# | |-- 10.pgm
#
if __name__ == "__main__":
if len(sys.argv) != 2:
print "usage: create_csv <base_path>"
sys.exit(1)
BASE_PATH=sys.argv[1]
SEPARATOR=";"
label = 0
for dirname, dirnames, filenames in os.walk(BASE_PATH):
for subdirname in dirnames:
subject_path = os.path.join(dirname, subdirname)
for filename in os.listdir(subject_path):
abs_path = "%s/%s" % (subject_path, filename)
print "%s%s%d" % (abs_path, SEPARATOR, label)
label = label + 1
python create_csv.py
<the ABSOLUTE directory path> > <your file name>
root@clanton:∼/emotion# python create_csv.py $(pwd)/pics/ > my_csv.csv
root@clanton
:∼/emotion# cat my_csv.csv
/home/root/emotion/pics/smile/smile01_20_20_70_70.jpg;0
/home/root/emotion/pics/smile/smile02_20_20_70_70.jpg;0
/home/root/emotion/pics/smile/smile03_20_20_70_70.jpg;0
/home/root/emotion/pics/surprised/surprised01_20_20_70_70.jpg;1
/home/root/emotion/pics/surprised/surprised02_20_20_70_70.jpg;1
/home/root/emotion/pics/surprised/surprised03_20_20_70_70.jpg;1
/home/root/emotion/pics/serious/serious01_20_20_70_70.jpg;2
/home/root/emotion/pics/serious/serious02_20_20_70_70.jpg;2
/home/root/emotion/pics/serious/serious03_20_20_70_70.jpg;2
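The labeling logic of create_csv.py — one integer label per subdirectory — can be exercised without a real directory tree. This sketch (the helper name build_csv_rows is hypothetical) stands in a plain list of (folder, files) pairs for os.walk():

```python
def build_csv_rows(base, folders, separator=";"):
    """Emit 'path;label' rows, incrementing the label once per folder,
    mirroring what create_csv.py does while walking the directory tree."""
    rows = []
    label = 0
    for folder, files in folders:
        for name in files:
            rows.append("%s/%s/%s%s%d" % (base, folder, name, separator, label))
        label += 1  # every subdirectory gets its own class index
    return rows

rows = build_csv_rows("/home/root/emotion/pics",
                      [("smile", ["a.jpg", "b.jpg"]),
                       ("surprised", ["c.jpg"])])
```

All pictures inside one emotion folder share the same label, which is exactly what FaceRecognizer needs to train one class per emotion.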
The Code for Emotion Classification
FaceRecognizer
, which is responsible for reading your models. In other words, it reads the pictures and each state index in the database and, using a model called fisherface
, feeds (or trains) the model in order to be able to predict emotions./*
* Copyright (c) 2011. Philipp Wagner <bytefish[at]gmx[dot]de>.
* Released to public domain under terms of the BSD Simplified license.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of the organization nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
*
* Manoel Ramon - 06/15/2014
* manoel.ramon@gmail.com
* code changed from original facerec_fisherface.cpp
* added:
* - adaptation to emotion detection instead of gender
* - picture taken from the default video device
* - added face and eye recognition
* - crop images based on human anatomy
* - prediction based on the recognized face
*
*/
#include <opencv2/opencv.hpp>
#include <stdio.h>
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/core/core.hpp"
#include "opencv2/contrib/contrib.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace std;
String face_cascade_name = "haarcascade_frontalface_alt.xml";
String eye_cascade_name = "haarcascade_eye.xml";
Mat faceDetect(Mat img);
CascadeClassifier face_cascade;
CascadeClassifier eyes_cascade;
enum EmotionState_t {
SMILE =0, // 0
SURPRISED, // 1
SERIOUS, // 2
};
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if(!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
int main(int argc, const char *argv[])
{
EmotionState_t emotion;
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc < 2) {
cout << "usage: " << argv[0] << " <csv.ext>" << endl;
exit(1);
}
if( !face_cascade.load( face_cascade_name ) ){ printf("--(!)Error loading\n"); return -1; };
if( !eyes_cascade.load( eye_cascade_name ) ){ printf("--(!)Error loading\n"); return -1; };
// -1 selects the first available camera; change the index if you want to use another camera
VideoCapture cap(-1);
//check if the capture device was opened properly
if(!cap.isOpened())
{
cout << "Capture could not be opened successfully" << endl;
return -1;
}
else
{
cout << "camera is ok.. Stay 2 ft away from your camera\n" << endl;
}
int w = 432;
int h = 240;
cap.set(CV_CAP_PROP_FRAME_WIDTH, w);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, h);
Mat frame;
cap >>frame;
cout << "processing the image...." << endl;
Mat testSample = faceDetect(frame);
// Get the path to your CSV.
string fn_csv = string(argv[1]);
// These vectors hold the images and corresponding labels.
vector<Mat> images;
vector<int> labels;
// Read in the data. This can fail if no valid
// input filename is given.
try
{
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
exit(1);
}
// Quit if there are not enough images for this demo.
if(images.size() <= 1)
{
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size:
int height = images[0].rows;
// The following lines create a Fisherfaces model for
// face recognition and train it with the images and
// labels read from the given CSV file.
// If you just want to keep 10 Fisherfaces, then call
// the factory method like this:
//
// cv::createFisherFaceRecognizer(10);
//
// However it is not useful to discard Fisherfaces! Please
// always try to use _all_ available Fisherfaces for
// classification.
//
// If you want to create a FaceRecognizer with a
// confidence threshold (e.g. 123.0) and use _all_
// Fisherfaces, then call it with:
//
// cv::createFisherFaceRecognizer(0, 123.0);
//
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
int predictedLabel = model->predict(testSample);
// To get the confidence of a prediction call the model with:
//
// int predictedLabel = -1;
// double confidence = 0.0;
// model->predict(testSample, predictedLabel, confidence);
//
string result_message = format("Predicted class = %d", predictedLabel);
cout << result_message << endl;
// giving the result
switch (predictedLabel)
{
case SMILE:
cout << "You are happy!" << endl;
break;
case SURPRISED:
cout << "You are surprised!" << endl;
break;
case SERIOUS:
cout << "You are serious!" << endl;
break;
}
cap.release();
return 0;
}
Mat faceDetect(Mat img)
{
std::vector<Rect> faces;
std::vector<Rect> eyes;
bool two_eyes = false;
bool any_eye_detected = false;
//detecting faces
face_cascade.detectMultiScale( img, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
if (faces.size() == 0)
{
cout << "Try again.. I did not detect any faces..." << endl;
exit(-1); // abort everything
}
Point p1 = Point(0,0);
for( size_t i = 0; i < faces.size(); i++ )
{
// we cannot draw in the image!!! otherwise it will mess with the prediction
// rectangle( img, faces[i], Scalar( 255, 100, 0 ), 4, 8, 0 );
Mat frame_gray;
cvtColor( img, frame_gray, CV_BGR2GRAY );
// cropping only the face in the region defined by faces[i]
Mat faceROI = frame_gray( faces[i] );
//In each face, detect eyes
eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 3, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );
for( size_t j = 0; j < eyes.size(); j++ )
{
Point center( faces[i].x + eyes[j].x + eyes[j].width*0.5, faces[i].y + eyes[j].y + eyes[j].height*0.5 );
// we cannot draw in the image!!! otherwise it will mess with the prediction
// int radius = cvRound( (eyes[j].width + eyes[j].height)*0.25 );
// circle( img, center, radius, Scalar( 255, 0, 0 ), 4, 8, 0 );
if (j==0)
{
p1 = center;
any_eye_detected = true;
}
else
{
two_eyes = true;
}
}
}
cout << "SOME DEBUG" << endl;
cout << "-------------------------" << endl;
cout << "faces detected:" << faces.size() << endl;
cout << "x: " << faces[0].x << endl;
cout << "y: " << faces[0].y << endl;
cout << "w: " << faces[0].width << endl;
cout << "h: " << faces[0].height << endl << endl;
Mat imageInRectangle;
imageInRectangle = img(faces[0]);
Size recFaceSize = imageInRectangle.size();
cout << recFaceSize << endl;
// for debug
imwrite("imageInRectangle.jpg", imageInRectangle);
int rec_w = 0;
int rec_h = faces[0].height * 0.64;
// checking the (x,y) for cropped rectangle
// based in human anatomy
int px = 0;
int py = 2 * 0.125 * faces[0].height;
Mat cropImage;
cout << "faces[0].x:" << faces[0].x << endl;
p1.x = p1.x - faces[0].x;
cout << "p1.x:" << p1.x << endl;
if (any_eye_detected)
{
if (two_eyes)
{
cout << "two eyes detected" << endl;
// we have detected two eyes
// we have p1 and p2
// left eye
px = p1.x / 1.35;
}
else
{
// only one eye was found.. need to check if the
// left or right eye
// we have only p1
if (p1.x > recFaceSize.width/2)
{
// right eye
cout << "only right eye detected" << endl;
px = p1.x / 1.75;
}
else
{
// left eye
cout << "only left eye detected" << endl;
px = p1.x / 1.35;
}
}
}
else
{
// no eyes detected but we have a face
px = 25;
py = 25;
rec_w = recFaceSize.width-50;
rec_h = recFaceSize.height-30;
}
rec_w = (faces[0].width - px) * 0.75;
cout << "px :" << px << endl;
cout << "py :" << py << endl;
cout << "rec_w:" << rec_w << endl;
cout << "rec_h:" << rec_h << endl;
cropImage = imageInRectangle(Rect(px, py, rec_w, rec_h));
Size dstImgSize(70,70); // same image size of db
Mat finalSizeImg;
resize(cropImage, finalSizeImg, dstImgSize);
// for debug
imwrite("onlyface.jpg", finalSizeImg);
cvtColor( finalSizeImg, finalSizeImg, CV_BGR2GRAY );
return finalSizeImg;
}
Reviewing opencv_emotion_classification.cpp
The code starts by defining an enum that matches the emotion index in the CSV file.

enum EmotionState_t {
SMILE =0, // 0
SURPRISED, // 1
SERIOUS, // 2
};
In the main() function, a variable of type EmotionState_t is created, and the program is expected to receive the name of the CSV file as an argument.

int main(int argc, const char *argv[])
{
EmotionState_t emotion;
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc < 2) {
cout << "usage: " << argv[0] << " <csv.ext>" << endl;
exit(1);
}
The faceDetect() method changes, compared to the faceDetect() method shown earlier:

Mat testSample = faceDetect(frame);
testSample contains the cropped face. This cropped image is the same size as the images in the database; in other words, the testSample image is 70x70. The image returned is in grayscale and is cropped like the images shown in Figure 7-12. For now, let's continue with the main() function; faceDetect() will be discussed in more detail later.

Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
int predictedLabel = model->predict(testSample);
class FaceRecognizer : public Algorithm

FaceRecognizer looks very simple, but in fact it's very powerful and complex. This class allows you to set different algorithms, including your own, to perform different kinds of image recognition. The model used in this project is called fisherface and it's created by the line:

Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
void FaceRecognizer::train(InputArrayOfArrays src, InputArray labels)
The train() method trains the model using the images and the corresponding emotion indexes (labels):

model->train(images, labels);
int FaceRecognizer::predict(InputArray src) const = 0
The predict() method returns the predicted emotion index (label) based on the image cast as the input array src. For example, considering that "happy" is labeled as 0 in the CSV file and the FaceRecognizer was trained, the prediction will return 0 if the image src is a picture of you smiling.

int predictedLabel = model->predict(testSample);
...
...
...
// giving the result
switch (predictedLabel)
{
case SMILE:
cout << "You are happy!" << endl;
break;
case SURPRISED:
cout << "You are surprised!" << endl;
break;
case SERIOUS:
cout << "You are serious!" << endl;
break;
}
If the image returned by faceDetect() is cropped properly and your expression is similar to the expression in the database, the algorithm will predict accurately.

The faceDetect() method basically does what was done before, as explained in the flowchart in Figure 7-10. In other words, it detects the face and eyes. The detected face region includes elements beyond the eyes, nose, and mouth (imageInRectangle). However, these elements are not interesting to the emotion classifier and must be removed (the red arrows area), and only the portion containing the eyes, nose, and mouth is cropped (cropImage). The cropped area is defined by the px and py coordinates with the extensions rec_w and rec_h, which form a rectangle with perfect dimensions for cropping the area. Such a rectangle corresponds to the ROI (Region of Interest) area. The idea is to determine the px, py, rec_w, and rec_h values in the image and crop the image.

When an eye is detected, the code stores a point p1 that corresponds to the center of the eye. The point object p1 has two members, x and y, that represent the distance in pixels to the original image. There are a couple of problems, however. Sometimes only one eye is detected and the algorithm must determine if it's the right or the left eye. Other times, no eye is detected.

//detecting faces
face_cascade.detectMultiScale( img, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
Point p1 = Point(0,0);
for( size_t i = 0; i < faces.size(); i++ )
{
...
...
...
// In each face, detect eyes
eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 3, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );
for( size_t j = 0; j < eyes.size(); j++ )
{
Point center( faces[i].x + eyes[j].x + eyes[j].width*0.5, faces[i].y + eyes[j].y + eyes[j].height*0.5 );
...
...
...
if (j==0)
{
p1 = center;
any_eye_detected = true;
}
else
{
two_eyes = true;
}
}
}
The following code determines the px and py coordinates, as well as the ROI dimensions, rec_w and rec_h.

int rec_w = 0;
int rec_h = faces[0].height * 0.64;
// checking the (x,y) for cropped rectangle
// based in human anatomy
int px = 0;
int py = 2 * 0.125 * faces[0].height;
Mat cropImage;
cout << "faces[0].x:" << faces[0].x << endl;
p1.x = p1.x - faces[0].x;
cout << "p1.x:" << p1.x << endl;
if (any_eye_detected)
{
if (two_eyes)
{
cout << "two eyes detected" << endl;
// we have detected two eyes
// we have p1 and p2
// left eye
px = p1.x / 1.35;
}
else
{
// only one eye was found.. need to check if the
// left or right eye
// we have only p1
if (p1.x > recFaceSize.width/2)
{
// right eye
cout << "only right eye detected" << endl;
px = p1.x / 1.75;
}
else
{
// left eye
cout << "only left eye detected" << endl;
px = p1.x / 1.35;
}
}
}
else
{
// no eyes detected but we have a face
px = 25;
py = 25;
rec_w = recFaceSize.width-50;
rec_h = recFaceSize.height-30;
}
rec_w = (faces[0].width - px) * 0.75;
cout << "px :" << px << endl;
cout << "py :" << py << endl;
cout << "rec_w:" << rec_w << endl;
cout << "rec_h:" << rec_h << endl;
cropImage = imageInRectangle(Rect(px, py, rec_w, rec_h));
For debugging purposes, the faceDetect() method saves two images in the file system every time the software runs. One is called onlyface.jpg and it contains the cropped image. The other is called imageInRectangle.jpg and it contains the detected face image.

Mat imageInRectangle;
imageInRectangle = img(faces[0]);
...
...
...
// for debug
imwrite("imageInRectangle.jpg", imageInRectangle);
cropImage = imageInRectangle(Rect(px, py, rec_w, rec_h));
...
...
...
Size dstImgSize(70,70); // same image size of db
Mat finalSizeImg;
resize(cropImage, finalSizeImg, dstImgSize);
Running opencv_emotion_classification.cpp
Make sure the uvcvideo driver is loaded and the webcam is connected to the USB port (read the "Connecting the Webcam" section in this chapter), and transfer the program to the same location as your CSV file. Stay in front of your camera, preferably two feet away, make some emotional expressions, and then run the following command:

root@clanton:∼/emotion# ./opencv_emotion_classification my_csv.csv 2> /dev/null
camera is ok.. Stay 2 ft away from your camera
processing the image....
SOME DEBUG
-------------------------
faces detected:1
x: 172
y: 25
w: 132
h: 132
[132 x 132]
faces[0].x:172
p1.x:-172
px :25
py :25
rec_w:80
rec_h:102
Predicted class = 0
You are happy!
Recovering the images onlyface.jpg and imageInRectangle.jpg from the file system, it is possible to observe my expression in the cropped image, shown in Figure 7-14.
root@clanton:∼/emotion# ./opencv_emotion_classification my_csv.csv 2> /dev/null
camera is ok.. Stay 2 ft away from your camera
processing the image....
SOME DEBUG
-------------------------
faces detected:1
x: 178
y: 3
w: 143
h: 143
[143 x 143]
faces[0].x:178
p1.x:43
two eyes detected
px :31
py :35
rec_w:84
rec_h:91
Predicted class = 1
You are surprised!
Recovering onlyface.jpg and imageInRectangle.jpg from the file system, it is possible to observe my expression in the cropped image, as shown in Figure 7-15.
Ideas for Improving the Project
Integrating Your Emotions with a Robotic Head
Expanding the Classifications
In this project, the fisherface model was used to classify emotions, but the same technique can be used to classify gender or recognize your family and friends. A gender classification tutorial is available at http://docs.opencv.org/trunk/modules/contrib/doc/facerec/tutorial/facerec_gender_classification.html.

Improving the Emotion Classification Using Large Databases
Several face databases that can be used to improve the training are listed at http://face-rec.org/databases/.

Improving the Emotion Classification for Several Faces
Summary
fisherface model.