2018 | Book

Computer Vision and Graphics

International Conference, ICCVG 2018, Warsaw, Poland, September 17 - 19, 2018, Proceedings

Editors: Leszek J. Chmielewski, Ryszard Kozera, Arkadiusz Orłowski, Dr. Konrad Wojciechowski, Prof. Alfred M. Bruckstein, Nicolai Petkov

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the International Conference on Computer Vision and Graphics, ICCVG 2018, held in Warsaw, Poland, in September 2018. The 45 full papers were selected from 117 submissions. The contributions are thematically arranged as follows: computer graphics, image quality and graphic, user interfaces, object classification and features, 3D and stereo image processing, low-level and middle-level image processing, medical image analysis, motion analysis and tracking, security and protection, pattern recognition and new concepts in classification.

Table of Contents


Computer Graphics, Perception and Image Quality

Hemispherical Gaussians for Accurate Light Integration

In real-time computer graphics, approximations are often used to allow for interactive virtual scene rendering. Concerning the on-line lighting of such scenes, an approach gaining increased recognition is to approximate the light in every direction of the hemisphere of a surface point using suitable mathematical distribution functions, such as the well-known Spherical Gaussian. A drawback of this distribution is that current methods using it are inaccurate and do not reflect the correct lighting integral over the surface hemisphere. We show a new and more accurate convolution of a Spherical Gaussian with a clamped cosine distribution. In short, we propose a closed-form approximation of the hemispherical integral of such a distribution in an arbitrary hemisphere. While our use case is the approximation of the hemispherical lighting situation, we believe that our general formulation of the hemispherical integral of a Spherical Gaussian can also be useful in other areas.

Julian Meder, Beat Brüderlin
Gaze-Dependent Screen Space Ambient Occlusion

Screen space ambient occlusion (SSAO) is a fast global illumination technique that approximates interreflections between rendered objects. Due to its simplicity, it is often implemented in commercial computer games. However, although SSAO calculations take only a few milliseconds per frame, they add a significant computational load to the total rendering time. In this work we propose a technique that accelerates the SSAO calculations using information about the observer’s gaze direction captured by an eye tracker. The screen region surrounding the observer’s gaze position is rendered with maximum quality, which is reduced gradually for higher eccentricities. The SSAO quality is varied by changing the number of samples used to approximate the SSAO occlusion shadows. The reduced sampling results in an almost two-fold acceleration of SSAO with negligible deterioration of the image quality.
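
The idea of lowering the sample count with distance from the gaze point can be illustrated with a minimal sketch. The function name, the linear falloff, the radii and the sample limits are all assumptions made for illustration; the abstract does not state the actual falloff used, and pixel distance stands in here for visual eccentricity.

```python
import math


def ssao_sample_count(pixel, gaze, max_samples=32, min_samples=4,
                      inner_radius=100.0, outer_radius=600.0):
    """Return a hypothetical number of SSAO samples for one pixel.

    pixel and gaze are (x, y) screen positions in pixels. Inside
    inner_radius the full sample count is used; beyond outer_radius the
    minimum is used; in between the count falls off linearly.
    All defaults are illustrative assumptions, not values from the paper.
    """
    # Distance from the gaze point, used as a proxy for eccentricity.
    ecc = math.hypot(pixel[0] - gaze[0], pixel[1] - gaze[1])
    if ecc <= inner_radius:
        return max_samples
    if ecc >= outer_radius:
        return min_samples
    # Linear interpolation between the two sample limits.
    t = (ecc - inner_radius) / (outer_radius - inner_radius)
    return round(max_samples + t * (min_samples - max_samples))
```

With these defaults a pixel at the gaze position gets 32 samples and a pixel far in the periphery gets 4, so most of the frame is shaded with far fewer samples than the foveal region.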

Radosław Mantiuk
A Fast Algorithm for Quaternion-Based 4D Rotation

In this work, a fast algorithm for quaternion-based 4D rotation is presented which reduces the number of underlying real multiplications. Performing a quaternion-based rotation using a rotation matrix takes 32 multiplications and 60 additions of real numbers, while the proposed algorithm computes the same result in only 16 real multiplications (or multipliers, in the case of a hardware implementation) and 56 additions.
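
For orientation, a 4D rotation can be written as a product of a 4D vector (treated as a quaternion) with a left and a right unit quaternion. The sketch below is a plain reference version built from two direct Hamilton products of 16 real multiplications each. It is not the reduced-multiplication algorithm from the paper, and the function names are assumptions.

```python
def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples.

    The direct form uses 16 real multiplications.
    """
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )


def rotate4d(left, v, right):
    """Rotate the 4D vector v as left * v * right.

    left and right are assumed to be unit quaternions, so the Euclidean
    norm of v is preserved.
    """
    return qmul(qmul(left, v), right)
```

Because both factors are unit quaternions, the result has the same length as the input vector, which is a quick sanity check for any faster variant.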

Aleksandr Cariow, Galina Cariowa, Dorota Majorkowska-Mech
A Study on Image Comparison Metrics for Atmospheric Scattering Phenomenon Rendering

Though reference image quality can be calculated with several well-established comparison methods, images that synthesize the light atmospheric scattering phenomenon require adequate evaluation approaches. Current metrics concentrate mainly on noise ratio, entropy or simple pixel correlation coefficients. Such methods therefore require the images to correspond strictly in the structure and position of their components. On the other hand, comparisons of light atmospheric scattering renders synthesized with different methods should concentrate on their structural representation and possible color gradients rather than on direct correspondence of individual pixels. The paper presents a study of image comparison methods in the context of the light atmospheric scattering phenomenon. We focus on several of the most popular image comparison metrics: Pearson Correlation Coefficient (PCC), Structural Similarity (SSIM), Multi-Scale Structural Similarity (MS-SSIM) and Perceptual Difference (PD). We compare these metrics on the clear sky synthesis problem and try to select the most relevant ones for the stated phenomenon. The conclusion and discussion provide a handful of suggestions concerning the selection of metrics for this phenomenon.
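
Of the metrics listed, the Pearson Correlation Coefficient is the simplest to state. The following sketch computes it for two grayscale images stored as nested lists of equal shape. The function name and the input representation are assumptions for illustration and are not taken from the paper.

```python
import math


def pearson_correlation(image_a, image_b):
    """Pearson Correlation Coefficient between two equally sized images.

    Each image is a list of rows of numeric pixel values. Raises
    ValueError if the shapes differ or if either image is constant,
    in which case the coefficient is undefined.
    """
    a = [p for row in image_a for p in row]
    b = [p for row in image_b for p in row]
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, without the common 1/N factor.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)
```

The coefficient compares pixels position by position, which is exactly the strict correspondence the abstract describes as a limitation for renders produced by different methods.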

Tomasz Gałaj, Adam Wojciechowski
Graphical Interface Design for Chatbots for the Needs of Artificial Intelligence Support in Web and Mobile Applications

The interest in the topic of conversational agents has been continuously rising for the past few years, as the technology itself has proved to have multiple practical applications. This paper discusses the design principles for graphical interfaces of conversational agents implemented for the needs of any branch of business that may benefit from the introduction of such solutions, including customer service, healthcare, sales and various types of services. Requirements are defined according to current trends in application design, including the use on mobile devices. The paper presents a survey on solutions fulfilling the mentioned requirements and discusses emerging issues. The paper also describes and proposes a reply scenario model suitable for the needs of implementing a flexible graphical interface for a modern chatbot-based system.

Mateusz Modrzejewski, Przemysław Rokita
Algorithms for Random Maps Generation and Their Implementation as a Python Library

Random map generation has applications in strategy computer games, terrain simulators, and other areas. In this paper the basic assumptions of a library for random map generation are presented. It uses both the value noise and diamond-square computer graphics algorithms, as well as newly invented algorithms for biome creation and river generation. The complete library implementation, with an example of its use in a separate application, is explained in detail. Basic issues related to developing programming libraries and random map generation are also discussed.
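
As a rough illustration of one of the algorithms named above, here is a compact diamond-square heightmap generator. It is a generic textbook version written for this listing, not code from the described library; the function name, the roughness parameter and the seeding scheme are assumptions.

```python
import random


def diamond_square(n, roughness=0.5, seed=0):
    """Generate a (2**n + 1) x (2**n + 1) heightmap with diamond-square.

    roughness scales the random offset after every subdivision and seed
    makes the result reproducible. Both are illustrative parameters.
    """
    rng = random.Random(seed)
    size = 2 ** n + 1
    h = [[0.0] * size for _ in range(size)]
    # Initialise the four corners with random heights.
    for y in (0, size - 1):
        for x in (0, size - 1):
            h[y][x] = rng.uniform(-1.0, 1.0)
    step, scale = size - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: centre of each square gets the corner average.
        for y in range(half, size, step):
            for x in range(half, size, step):
                avg = (h[y - half][x - half] + h[y - half][x + half] +
                       h[y + half][x - half] + h[y + half][x + half]) / 4.0
                h[y][x] = avg + rng.uniform(-scale, scale)
        # Square step: edge midpoints get the average of their neighbours.
        for y in range(0, size, half):
            for x in range((y + half) % step, size, step):
                total, count = 0.0, 0
                for dy, dx in ((-half, 0), (half, 0), (0, -half), (0, half)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < size and 0 <= nx < size:
                        total += h[ny][nx]
                        count += 1
                h[y][x] = total / count + rng.uniform(-scale, scale)
        step = half
        scale *= roughness
    return h
```

Calling `diamond_square(3)` yields a 9 x 9 grid of heights; thresholding such a grid (for example, treating values below zero as water) is one simple way to turn it into a map.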

Marian Rusek, Rafał Jusiak, Waldemar Karwowski
Modeling and Rendering of Volumetric Clouds in Real-Time with Unreal Engine 4

Simulation of realistic clouds is a difficult task that graphic designers have been trying to achieve for many years. Clouds are necessary for video games and animated movies with outdoor scenes. Unreal Engine 4, currently one of the most advanced game engines, offers various solutions for implementing such effects. In this paper, we present effective methods for obtaining realistic real-time clouds in Unreal Engine 4 and provide some new extensions to make the result even better. The methods described are suitable for modeling and rendering both a single cloud and an entire sky filled with clouds. Among the proposed techniques are modeling clouds with noise functions, modeling with a simplified simulation of cloud-forming phenomena, rendering with the ray marching technique, and using fluid simulation to allow clouds to interact with other objects in the scene.

Łukasz Nowak, Artur Bąk, Tomasz Czajkowski, Konrad Wojciechowski
Real-Time Simulation of Animated Characters Crowd in Unreal Engine 4

Realistic crowd simulation is still not an easy task in the computer animation and computer games domain, especially when it has to be done in real time and no post-processing correction is possible. The goal of this paper is to analyze and compare the algorithms for crowd behavior and collision avoidance available in Unreal Engine 4, which is one of the most popular game engines. We addressed various methods: those natively implemented in the engine, those requiring their own implementation, and those requiring the use of external software. We found one of the natively implemented algorithms, Detour Crowd, to be the most useful in terms of quality and flexibility to extend, so we use it as a base for our own improvements of engine functionality to achieve more believable movement of actors in crowd simulation. These improvements are also addressed in the paper.

Michał Rosenbeiger, Artur Bąk, Tomasz Czajkowski

Object Classification and Features

Plane Object-Based High-Level Map Representation for SLAM

High-level map representation providing object-based understanding of the environment is an important component for SLAM. We present a novel algorithm that builds a plane object-based map representation upon a point cloud obtained in real time from RGB-D sensors such as Kinect. On the basis of the planes segmented in the point cloud we construct a graph, where a node represents a plane and an edge represents its real intersection with another plane. After that, we extract all trihedral angles (corners), represented by 3rd order cycles in the graph. Afterwards, we systematically aggregate trihedral angles into objects, such that trihedral angles of the same plane-based object have common edges. Finally, we classify objects using simple subgraph patterns and determine their physical sizes. Our experiments showed that the proposed algorithm reliably extracts objects, determines their physical sizes and classifies them with promising performance.
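
The corner extraction step, finding 3rd order cycles in the plane graph, can be sketched in a few lines. The graph encoding (plane identifiers as nodes, pairs of intersecting planes as edges) follows the abstract, while the function name and the brute-force search are assumptions for illustration rather than the authors' implementation.

```python
from itertools import combinations


def trihedral_corners(edges):
    """Return all 3-cycles in an undirected plane-intersection graph.

    edges is an iterable of (plane_a, plane_b) pairs, one for each pair
    of planes that really intersect. Each returned triple of mutually
    intersecting planes is a candidate trihedral angle (corner).
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    corners = []
    # Brute force over node triples; fine for the small graphs of a room.
    for a, b, c in combinations(sorted(adjacency), 3):
        if b in adjacency[a] and c in adjacency[a] and c in adjacency[b]:
            corners.append((a, b, c))
    return corners
```

For a box with six faces, where every face intersects all faces except the opposite one, this returns eight triples, one per corner of the box.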

Pavel Gritsenko, Igor Gritsenko, Askar Seidakhmet, Bogdan Kwolek
Level-Set Based Algorithm for Automatic Feature Extraction on 3D Meshes: Application to Crater Detection on Mars

The knowledge of the origin and development of all bodies in the solar system begins with understanding the geologic history and evolution of the universe. The only approach for dating celestial body surfaces is by the analysis of the crater impact density and size. In order to facilitate this process, automatic approaches have been proposed for impact crater detection. In this article, we propose a novel approach for detecting craters’ rims. The developed method is based on a study of the Digital Elevation Model (DEM) geometry, represented as a 3D triangulated mesh. We use curvature analysis, in combination with a fast local quantization method, to automatically detect the craters’ rims with an artificial neural network. The validation of the method is performed on Barlow’s database.

Nicole Christoff, Agata Manolova, Laurent Jorda, Sophie Viseur, Sylvain Bouley, Jean-Luc Mari
A System for Automatic Town Sign Recognition for Driver Assistance Systems

The paper describes two different ways of recognizing road signs that can be applied in autonomous driver assistance systems. It provides the road sign content and an analysis of the implemented algorithms, in order to apply them in such systems and thereby extend their functionality. The mobile application implemented as part of the experiments works on real-time data. The application has been tested in practice: a smartphone placed in a car registered and analyzed the road signs. The paper describes the possibilities of practical use of a mobile device in combination with a real-time data processing program for the detection and recognition of selected road signs. The theoretical part discusses some important automotive topics and selected methods for road sign analysis. In the practical part, two different methods of road sign recognition have been implemented and analyzed. The application has been developed for the iOS system, with additional use of the OpenCV library. The possibilities of further development and optimization of the selected algorithms are also shown.

Dariusz Frejlichowski, Piotr Mikołajczak
Selective and Simple Graph Structures for Better Description of Local Point-Based Image Features

The paper presents simple graph features based on well-known image keypoints. We discuss the extraction method and the geometrical properties that can be used. Chosen methods are tested in KNN tasks for almost 1000 object classes. The approach addresses problems in applications that cannot use learning methods explicitly, such as real-time tracking, chosen object detection scenarios and structure from motion. Results imply that the idea is worth further research for chosen systems.

Grzegorz Kurzejamski, Marcin Iwanowski
Scene Recognition for Indoor Localization of Mobile Robots Using Deep CNN

In this paper we propose a deep neural network based algorithm for indoor place recognition. It uses transfer learning to retrain VGG-F, a pretrained convolutional neural network, to classify places in images acquired by a humanoid robot. The network has been trained as well as evaluated on a dataset consisting of 8000 images, which were recorded in sixteen rooms. The dataset is freely accessible from our website. We demonstrated experimentally that the proposed algorithm considerably outperforms BoW algorithms, which are frequently used in loop closure. It also outperforms an algorithm in which features extracted by the FC-6 layer of VGG-F are classified by a linear SVM.

Piotr Wozniak, Hadha Afrisal, Rigel Galindo Esparza, Bogdan Kwolek
Character Recognition Based on Skeleton Analysis

Character recognition is a prominent field of research in pattern recognition. The low error rate of methods presented in other papers indicates that the problem of recognizing typewritten fonts is solved, mainly by deep learning methods. However, those algorithms do not work as well for handwritten characters, since learning discriminative features is much more complex for this problem, so it remains an interesting issue from a research point of view. This document presents a proposal to solve the handwritten character recognition problem using the k3m skeletonization algorithm. The idea has been designed to work correctly regardless of the width of the characters, their thickness or shape. This is an innovative method not considered in previous papers, which yields results comparable to the best ones achieved so far, as shown in tests. The method can also be easily extended to signs other than glyphs of the Latin alphabet.

Kacper Sarnacki, Khalid Saeed
Weather Characterization from Outdoor Scene Images

This paper presents a variety of techniques to characterize outdoor scene images into their respective weather components: the sky condition (sunny or cloudy), the presence of rain, and the visibility. Weather characterization is not an easy task due to the large variety of outdoor scene images. Each weather component is characterized individually. The results obtained are promising: the sky condition was classified with 80% accuracy, the presence of rain with 60% accuracy, and visibility was estimated with a mean squared error (MSE) of 1.41.

Jenade Moodley, Serestina Viriri

3D and Stereo Image Processing

Clustering Quality Measures for Point Cloud Segmentation Tasks

This paper presents improved weighted measures for evaluating the quality of point cloud segmentation. They provide a more reliable and intuitive appraisal as well as more representative classification characteristics. The new measures are compared with existing ones based on classification and on information theory. The experiments and the evaluation of the measures were performed for a recent, well-performing plane segmentation method. The experimental results showed that the newly elaborated measures provide a researcher with distinct information about the segmentation output. This paper introduces recommendations for adjusting quality measures to a particular planar fragment detection problem, which contributes to the effective development of such methods.

Jakub Walczak, Adam Wojciechowski
Multi-camera Photometric Simulation for Creation of 3D Object Reconstruction System

Photogrammetry allows a three-dimensional reconstruction of an object based on multiple photographs of it. The quality of the reconstruction result depends mostly on the gloss, the diversity of the texture, the lighting conditions, the quality of the camera calibration and the shape of the object. The article presents the results of a simulation of a multi-camera reconstruction system, carried out for the needs of developing a 3D object reconstruction system (3D scanner). The 3D reconstruction system works by simultaneously taking photographs with cameras located around the object. The simulation was created to investigate the optimal distribution of cameras and of projectors casting a pattern that increases the number of characteristic points on the surface of the object. The impact of background removal in images on the reconstruction result, as well as the texture quality of the object depending on the resolution and distance of the cameras from the object, were also investigated. The graphics engine used to create the simulation also allows testing of various types of object lighting. The presented results show that the parameters of the system structure, such as the placement of cameras and projectors and the selection of patterns projected by the projectors, are important and that their values can be determined at the stage of system simulation. Conceptual errors at the simulation stage can be removed at minimal cost, and the actual system can be created on the basis of tested assumptions. Research conducted in real-world conditions on the 3D object reconstruction system designed from the simulated parameters confirms the validity of the use of simulation.

Dawid Sobel, Karol Jedrasiak, Aleksander Nawrat
Quality Evaluation of 3D Printed Surfaces Based on HOG Features

The main purpose of the visual quality assessment of 3D prints is the detection of surface distortions, which can be done using various approaches. Nevertheless, a reliable classification of 3D printed samples into low and high quality ones can be troublesome, especially when the color of the filament is unknown. Such a classification can be efficiently conducted using the approach based on the Histogram of Oriented Gradients (HOG) proposed in this paper. The obtained results are very promising and allow proper classification of most of the tested samples, especially for some of the most typical distortions.

Piotr Lech, Jarosław Fastowicz, Krzysztof Okarma
Convolutional Neural Network-Based Action Recognition on Depth Maps

In this paper, we present an algorithm for action recognition that uses only depth maps. We propose a set of handcrafted features to describe a person’s shape in noisy depth maps. We extract features with a convolutional neural network (CNN), which has been trained on multi-channel input sequences consisting of two consecutive depth maps and a depth map projected onto an orthogonal Cartesian plane. We show experimentally that combining features extracted by the CNN with the proposed features leads to better classification performance. We demonstrate that an LSTM trained on such aggregated features achieves state-of-the-art classification performance on the UTKinect dataset. We propose a global statistical descriptor of temporal features. We show experimentally that such a descriptor has high discriminative power on time series of CNN features concatenated with handcrafted features.

Jacek Trelinski, Bogdan Kwolek
An Integrated Procedure for Calibrating and Distortion Correction of the Structure Sensor and Stereo-Vision Depth Sensors

The paper presents a calibration procedure for a 3D scene reconstruction system consisting of an active depth sensor (Structure Sensor) and a stereo camera with wide-angle lenses. The wide-angle lenses with large radial distortions used in the stereoscopic part of the system require the application of a fish-eye model for correcting geometric distortions, while for the infrared camera of the Structure Sensor a traditional pinhole model is sufficient. Calibration of the system also comprises a procedure for correcting depth distortions introduced by the Structure Sensor device. A simple yet efficient method for calibrating the cameras using functions provided by the OpenCV library is proposed. The system is part of a device helping visually impaired people to navigate in the environment.

Dariusz Rzeszotarski, Pawel Strumillo
Second-Order Algebraic Surfaces and Two Image Photometric Stereo

This paper discusses the special case of reconstructing an unknown Lambertian surface from two-image photometric stereo. Both images are assumed here to be formed by a genuine second-order algebraic surface. The corresponding uniqueness issue is discussed for different pairs of image irradiance equations under various illumination settings. Illustrative examples supplement the presented analysis and demonstrate its main gist.

Ryszard Kozera, Alexander Prokopenya

Low-Level and Middle-Level Image Processing

Improving RGB Descriptors Using Depth Cues

Geometrical distortions are tackled in different ways in multiple keypoint detection and feature extraction algorithms. However, those are implemented as an integral part of the solution, making it impossible to use the same distortion removal method in other solutions. To the best of the authors’ knowledge, there are no universal methods of distortion removal that can be used as an intermediate step between keypoint detection and feature extraction. Creating that kind of algorithm, instead of developing yet another ‘robust descriptor’, will enable seamless integration in existing applications and, possibly, will increase the object recognition success rate, independent of the selected keypoint detector/descriptor pair.

Maciej Stefańczyk
Embedding Spatial Context into Spectral Angle Based Nonlinear Mapping for Hyperspectral Image Analysis

Due to the high dimensionality and redundancy of hyperspectral images, an important step in analyzing such images is to reduce the dimensionality. In this paper, we propose and study a dimensionality reduction technique based on the approximation of spectral angle mapper (SAM) measures by Euclidean distances. The key feature of the proposed method is the integration of spatial information into the dissimilarity measure. The experiments performed on open hyperspectral datasets showed that the developed method can be used in the analysis of hyperspectral images.
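
The spectral angle underlying the SAM measure is the angle between two pixel spectra viewed as vectors. The sketch below shows only this basic dissimilarity. It does not include the spatial context or the nonlinear mapping proposed in the paper, and the function name is an assumption.

```python
import math


def spectral_angle(spectrum_a, spectrum_b):
    """Spectral angle in radians between two spectra of equal length.

    The angle ignores the overall brightness of a spectrum: scaling
    either input by a positive constant leaves the result unchanged.
    """
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(x * x for x in spectrum_a))
    norm_b = math.sqrt(sum(y * y for y in spectrum_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectral angle is undefined for a zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

A dimensionality reduction of the kind described would then look for low-dimensional points whose Euclidean distances approximate these pairwise angles.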

Evgeny Myasnikov
Color Object Retrieval Using Local Features Based on Opponent-Process Theory

Although color is perceived as an irreplaceable element describing the world around us, the techniques for extracting local features are mostly based on the description of intensities, while the color information is fully ignored. This paper proposes a method for extracting the local features of a color image. As a basic model we have chosen the approach to the human visual system using chromatic opponent channels, together with the SIFT (Scale-Invariant Feature Transform) method. The idea of this solution is to incorporate the opponent chromatic channels by replacing the grayscale information in the SIFT method, so that the keypoints are detected on two separate opponent channels. For the interest points found in the two channels, descriptors are formed which are then united into one set. We also propose new methods for validating the keypoint pairing that utilize a keypoint orientation consistency check. The algorithm was tested in an object retrieval experiment.

Paula Budzakova, Elena Sikudova, Zuzana Berger Haladova
Extracting Textual Overlays from Social Media Videos Using Neural Networks

Textual overlays are often used in social media videos, as people who watch them without sound would otherwise miss essential information conveyed in the audio stream. This is why the extraction of those overlays can serve as an important metadata source, e.g. for content classification or retrieval tasks. In this work, we present a robust method for extracting textual overlays from videos that builds upon multiple neural network architectures. The proposed solution relies on several processing steps: keyframe extraction, text detection and text recognition. The main component of our system, i.e. the text recognition module, is inspired by a convolutional recurrent neural network architecture, and we improve its performance using a synthetically generated dataset of over 600,000 images with text, prepared by the authors specifically for this task. We also develop a filtering method that reduces the number of overlapping text phrases using the Levenshtein distance and further boosts the system’s performance. The final accuracy of our solution reaches over 80% and is on par with state-of-the-art methods.
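
The filtering step based on the Levenshtein distance can be illustrated as follows. The distance function is the standard dynamic-programming formulation; the greedy filter, its name and the threshold value are assumptions for illustration, since the abstract does not describe the exact filtering rule.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def filter_similar_phrases(phrases, max_distance=2):
    """Keep a phrase only if it differs enough from every phrase kept so far.

    max_distance is a hypothetical threshold: phrases within this edit
    distance of an already kept phrase are treated as duplicates.
    """
    kept = []
    for phrase in phrases:
        if all(levenshtein(phrase, other) > max_distance for other in kept):
            kept.append(phrase)
    return kept
```

Applied to text recognized on consecutive keyframes, such a filter collapses near-identical readings of the same overlay, for example ones that differ by a single misrecognized character.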

Adam Słucki, Tomasz Trzciński, Adam Bielski, Paweł Cyrta
Choosing an Optimal Bracketing Sequence for HDR Imaging

In this paper we present a new concept of selecting a subset of significant images from a large sequence of multi-exposure bracketing photographs used for HDR imaging. The presented algorithm picks several frames from a sequence taken with increasing exposure and uses them to generate a high dynamic range image. The choice allows us to retain the photos with important data and to discard the images with redundant or deformed visual information, such as overexposed or dark areas. We have tested the presented technique by selecting n = 7 significant frames from a bracketing sequence of 49 elements. The chosen photos have been used to compose an HDR image which contains a maximum insight into the scene with a minimal presence of luminous deformations.

Paweł J. Łubniewski, Wojciech S. Mokrzycki
Detection of Pollen Grains in Digital Microscopy Images by Means of Modified Histogram Thresholding

The paper describes and investigates the application of the algorithm for the detection and extraction of pollen contour shapes in digital microscopic images. This is the first step in the process of identification of pollen grains in order to obtain a method for automatic or semi-automatic analysis of air samples. The final approach is supposed to support this process by recognizing pollen types in digital microscopic images. The applied segmentation approach is based on the Modified Histogram Thresholding, previously employed in the extraction of red blood cells for the automatic diagnosis of certain diseases based on the erythrocyte shapes.

Dariusz Frejlichowski

Medical Image Analysis

U-CatcHCC: An Accurate HCC Detector in Hepatic DCE-MRI Sequences Based on an U-Net Framework

This paper presents a novel framework devoted to the detection of HCC (Hepato-Cellular Carcinoma) within hepatic DCE-MRI (Dynamic Contrast-Enhanced MRI) sequences by a deep learning approach. In clinical routine, radiologists usually consider different phases during contrast injection (for instance before injection, the arterial phase and the portal phase) for HCC diagnosis. By employing a U-Net architecture, we are able to identify such tumors with very high accuracy (a classification rate of 98.5% at best) for a small cohort of patients, which should be confirmed in future work by considering larger groups. We also show in this paper the influence of patch size on this machine learning process, and the positive impact of employing all phases available in DCE-MRI sequences compared to using only one.

Anna Fabijańska, Antoine Vacavant, Marie-Ange Lebre, Ana L. M. Pavan, Diana R. de Pina, Armand Abergel, Pascal Chabrot, Benoît Magnin
Unsupervised Caries Detection in Non-standardized Periapical Dental X-Rays

Dental caries are currently one of the most prevalent diseases in the modern world. Early detection and diagnosis of the disease is the best treatment available to dental healthcare professionals and is crucial in preventing advanced stages of decay. This paper presents an effective model for caries detection across a variety of non-uniform X-rays using individual tooth segmentation, boundary detection and caries detection through image analysis techniques. The tooth segmentation is implemented using integral projection and an analytical division algorithm. The boundary detection is implemented through the use of top-hat and bottom-hat transformations and active contours. Finally, the caries detection is achieved through the use of blob detection and cluster analysis on suspected carious regions. The cluster analysis generates its results relative to the image being analyzed and, as such, forms the unsupervised evaluation approach of this paper. The viability of this unsupervised learning model, and its effectiveness in accurately diagnosing dental caries relative to current systems, is indicated by the results detailed in this paper, with the proposed model achieving a 96% correct diagnosis rate.

Darren Osterloh, Serestina Viriri
Localizing Characteristic Points on a Vertebra Contour by Using Shape Language

In this paper, X-ray images are analysed by using the shape language. The algorithm combines syntactic and geometric approaches. The geometric features of the contour are described by using the syntactic approach. The points on the contour where pathological changes can occur are localised effectively by the algorithm.

Marzena Bielecka, Andrzej Bielecki
Lytic Region Recognition in Hip Radiograms by Means of Statistical Dominance Transform

Total hip replacement is the accepted treatment procedure for end-stage degeneration of the hip joint. Instability of the prosthesis might be recognized on radiographic images as an area of bone radiolucency adjacent to the prosthesis pin. However, the very important issue of radiological recognition of periprosthetic lucent areas reflecting the lysis remains a challenge. The small dimensions and fuzzy borders of the lytic areas make them difficult to recognize. Additional factors, such as a high BMI of the patient and/or radiograms taken through a mattress, can make the evaluation even more difficult, while small lucent areas might be additionally blurred and of very low contrast. The paper presents a new approach for quantitative recognition of periprosthetic lytic areas. We have proposed a multistep algorithm utilizing the Statistical Dominance Transform for detection of lytic areas on digital radiograms. Preliminary results are quite promising. It was demonstrated that the location and shape of the detected lytic region are in good agreement with the assessment by radiologists.

Marcin Kociołek, Adam Piórkowski, Rafał Obuchowicz, Paweł Kamiński, Michał Strzelecki

Motion Analysis and Tracking

Aggregation of Binary Feature Descriptors for Compact Scene Model Representation in Large Scale Structure-from-Motion Applications

In this paper we present an efficient method for aggregating binary feature descriptors to allow a compact representation of a 3D scene model in incremental structure-from-motion and SLAM applications. All feature descriptors linked with one 3D scene point or landmark are represented by a single low-dimensional real-valued vector called a prototype. The method allows a significant reduction of the memory required to store and process feature descriptors in large-scale structure-from-motion applications. Efficient approximate nearest neighbour search methods suited for real-valued descriptors, such as FLANN [19], can be used on the resulting prototypes to speed up the matching of processed frames.

Jacek Komorowski, Tomasz Trzciński
Shallow Convolutional Neural Network and Viterbi Algorithm for Dim Line Tracking

The estimation of a line is important in numerous practical applications. The most difficult case is when the line is dim, or even hidden in background noise. The application of Track-Before-Detect algorithms allows the tracking of such a line. Additional preprocessing using a shallow neural network trained for the detection of line features is proposed in this paper. Four variants of data fusion from the neural network are compared. The direct output of the neural network working as a classifier gives the best results for the Mean Absolute Error (MAE) metric. Similar results are obtained if the output of the neural network is used as a mask for the input image. Monte Carlo tests are used to obtain unbiased results. The tests show an approximately two-fold improvement in MAE. Using the binary output of the neural network is the wrong solution and yields the largest error. The influence of the number of convolutional layer neurons is not significant in this test.

Przemyslaw Mazurek
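
The Viterbi-based track-before-detect idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes a roughly vertical line that crosses every image row at exactly one pixel, with the column allowed to shift by at most `max_shift` between consecutive rows. The function name `viterbi_line`, the parameter `max_shift`, and the toy score map are hypothetical. The input could be a raw image or a per-pixel score map such as a classifier output.

```python
def viterbi_line(score, max_shift=1):
    """Return (best_total, path) for the maximum-sum top-to-bottom path.

    score: list of rows, each a list of per-pixel scores.
    path[r] is the column chosen in row r; consecutive columns differ
    by at most max_shift (an assumed smoothness constraint).
    """
    rows, cols = len(score), len(score[0])
    best = list(score[0])   # best accumulated score ending at each column
    back = []               # back[r - 1][c] = predecessor column for row r
    for r in range(1, rows):
        new_best, pointers = [], []
        for c in range(cols):
            lo, hi = max(0, c - max_shift), min(cols - 1, c + max_shift)
            prev = max(range(lo, hi + 1), key=lambda k: best[k])
            new_best.append(best[prev] + score[r][c])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Backtrack from the best column in the last row.
    c = max(range(cols), key=lambda k: best[k])
    total = best[c]
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    path.reverse()
    return total, path


# Toy 4x5 score map containing a faint slanted line (hypothetical data).
toy = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
total, path = viterbi_line(toy)
print(total, path)
```

Because the accumulated score is maximized over the whole path before any decision is made, weak per-pixel evidence can add up along the line, which is the motivation for track-before-detect on dim targets.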
Fast-Tracking Application for Traffic Signs Recognition

Traffic sign recognition is among the major tasks of driver assistance systems. Convolutional neural networks (CNNs) play an important role in achieving good traffic sign recognition accuracy, which helps to limit dangerous actions by the driver and to respect road laws. The accuracy of detection and classification determines how powerful the technique is. SSD MultiBox (Single Shot MultiBox Detector), an approach based on the convolutional neural network paradigm, is adopted in this paper: firstly, because it can be relied on for real-time applications, running at 59 FPS (frames per second); secondly, in order to overcome difficulties in the multiple layers of DeeperCNN and provide finer accuracy. Moreover, our experiment on the German Traffic Sign Recognition Benchmark (GTSRB) demonstrated that the proposed approach could achieve competitive results (83.2% in 140,000 learning steps) using a GPU parallel system and TensorFlow.

Abderrahmane Adoui El Ouadrhiri, Jaroslav Burian, Said Jai Andaloussi, Rachida El Morabet, Ouail Ouchetto, Abderrahim Sekkaki
Embedded Vision System for Automated Drone Landing Site Detection

This paper presents an embedded video subsystem used to classify the terrain, based on an image from a camera located under the drone, for the purpose of an automatic landing system. Colour and texture features, as well as decision trees and support vector machine classifiers were analysed and evaluated. The algorithm was supported with a shadow detection module. It was evaluated on 100 test cases and achieved over 80% performance. The designed video system was implemented on two embedded platforms – a Zynq SoC (System on Chip – Field Programmable Gate Array + ARM processor system) and a Jetson GPU (Graphic Processing Unit + ARM processor system). The performance achieved on both architectures is compared and discussed.

Patryk Fraczek, Andre Mora, Tomasz Kryjak

Human Face, Gestures and Action Analysis

Silhouette-Based Action Recognition Using Simple Shape Descriptors

This paper presents a human action recognition method based on silhouette sequences and simple shape descriptors. The proposed solution uses single scalar shape measures to represent each silhouette from an action sequence. The scalars are then combined into a vector that represents the entire sequence. In the following step, the vectors are transformed into sequence representations and matched using the leave-one-out cross-validation technique and a selected similarity or dissimilarity measure. Additionally, action sequences are pre-classified using information about the centroid trajectory into two subgroups: actions that are performed in place and actions during which a person moves in the frame. The average percentage accuracy is 80%, which is very satisfactory considering the very small amount of data used. The paper provides information on the approach, some key definitions and experimental results.

Katarzyna Gościewska, Dariusz Frejlichowski
Landmark-Based Re-topology of Stereo-Pair Acquired Face Meshes

Believable, detailed digital faces are desired for both film and game production as well as research applications. A high-resolution mesh of a face is acquired often with structured light and stereo computer-vision methods. In production for visual effects or games, the acquired mesh is “re-topologized” by an artist using software to draw a new edge layout in order to simplify and prepare a mesh for animation. For building models of facial population and expression for research, meshes often undergo a similar process to find correspondence among face models, although in the past the desired layout has not been designed for effective facial animation. We propose a process using images accompanying a 3D scan to locate dense 2D facial landmarks automatically to re-topologize meshes with the particular goal of creating models that deform well for facial animation. The technique also allows real-time editing of the re-topology relationship if modifications are desired.

Eric Patterson, Jessica Baron, Devin Simpson
A Kinematic Gesture Representation Based on Shape Difference VLAD for Sign Language Recognition

Automatic sign language recognition (SLR) is a fundamental task to help with the inclusion of the deaf community in society, facilitating, nowadays, many conventional multimedia interactions. In this work, a novel approach is proposed to represent gestures in SLR as a shape difference-VLAD mid-level coding of kinematic primitives captured along videos. This representation captures local salient motions together with regional dominant patterns developed by articulators along utterances. Also, the special VLAD representation not only quantifies local motion patterns but also captures the shape of motion descriptors, achieving a proper regional gesture characterization. The proposed approach achieved an average accuracy of 85.45% on a corpus of 64 sign words captured in 3200 videos. Additionally, on the Boston sign dataset the proposed approach achieves competitive results, with an average accuracy of 82%.

Jefferson Rodríguez, Fabio Martínez
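
For readers unfamiliar with VLAD coding, the following minimal Python sketch shows the standard VLAD encoding step only: each local descriptor is assigned to its nearest codebook centroid, the residuals are accumulated per centroid, and the concatenated result is L2-normalized. It does not reproduce the shape difference variant proposed in the paper. The function name `vlad`, the two-entry codebook and the descriptors are hypothetical toy values.

```python
import math


def vlad(descriptors, centroids):
    """Standard VLAD: sum of residuals to the nearest centroid, L2-normalized."""
    k, d = len(centroids), len(centroids[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Index of the nearest centroid (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda i: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[i])),
        )
        for t in range(d):
            acc[j][t] += x[t] - centroids[j][t]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm else flat


# Hypothetical codebook with two centroids and three 2-D descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
code = vlad(descriptors, centroids)
print(code)
```

The resulting vector has length k times d regardless of how many descriptors a video contributes, which is what makes this family of encodings convenient as a fixed-size input to a classifier.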

Security and Protection

Camera Sensor Traces Analysis in Image Forgery Detection Problem

One of the most frequently used types of image forgery is embedding another image fragment in some part of the image. In this article, a method for detecting this type of forgery is proposed. The method is based on the analysis of traces introduced by the camera sensor used to obtain an image. The analyzed image is divided into blocks; for each block we calculate a criterion value determining the probability of the presence or absence of CFA artifacts and, as a consequence, the probability that the block is a forgery. In the experimental part of the work, the accuracy of the detection of the embedded regions is analyzed. We also analyze the robustness of the proposed algorithm to various types of distortion: additive Gaussian noise, JPEG compression and linear contrast enhancement. The results of the experiments showed that the method makes it possible to detect embedded regions of various nature, shape and size, and that it is robust to additive Gaussian noise and linear contrast enhancement for a given range of distortion parameters, but is not robust to JPEG compression. A distinctive feature of the method is the ability to identify embedded regions with a minimum size of 2 × 2 pixels.

Andrey Kuznetsov
Barcoding in Biometrics and Its Development

The problems of using barcoding in biometrics and its applications are discussed. An overview of the existing solutions to this problem is presented for the different biometric modalities: face, iris, fingerprints, DNA, voice, etc. It is shown that the factor limiting the use of barcoding in biometrics was the low capacity of barcodes. It is noted that at present this problem is being solved within the framework of color barcodes. In this case, the problems of barcoding in biometrics and its applications can be solved in a new way, and examples of these solutions are given in the article.

Nazym Kaziyeva, Georgy Kukharev, Yuri Matveev
Digital Image Forensics Technique for Copy-Move Forgery Detection Using DoG and ORB

Copy–Move forgery, or cloning, is image tampering or alteration by copying one area of an image and pasting it into another area of the same image. Due to the availability of powerful image editing software, the process of malicious manipulation, editing and creating fake images has become tremendously simple. Thus, there is a need for robust PBIF (Passive–Blind Image Forensics) techniques to validate the authenticity of digital images. In this paper, a CMFD (Copy–Move Forgery Detection) technique is implemented that uses the DoG (Difference of Gaussian) blob detector to detect regions in the image together with ORB (Oriented FAST and Rotated BRIEF), a feature detection technique that is rotation invariant and resistant to noise. The technique is evaluated on different standard datasets, and experimental results are presented.

Patrick Niyishaka, Chakravarthy Bhagvati

Pattern Recognition and New Concepts in Classification

Does the Research Question Structure Impact the Attention Model? User Study Experiment

The main purpose of this work is to analyse the influence of the questions asked in perceptual experiments on the reliability of the results obtained. The problem has been investigated in the literature, but in the form of theoretical discussion rather than quantitative evaluation. Therefore, in this paper we include a quantitative evaluation and a visualization of the results. The main novelty, however, is the use of human cognitive style in the research question problem. The analysis was supported by FRIS® [7], which is a psychometric model and also an inventory for measuring cognitive styles. We conducted a user study to investigate how observers understand the research question given during the experiment. Two kinds of questions were used in the experiment: one not suggesting a response and one suggesting an answer. We focused on the objects in the images that attract the observer’s attention depending on the structure of the given question. The results were subjected to statistical analysis, which proved that the correctness of experimental results is highly correlated with the way the research question is designed. Questions that suggest the answer affect the outcome of the perceptual experiment and make it unreliable. The results confirmed that human cognitive style is highly correlated with the interpretation of research questions.

Malwina Dziśko, Anna Lewandowska, Anna Samborska-Owczarek
Pattern Recognition Method for Classification of Agricultural Scientific Papers in Polish

Calculation of text similarity is an essential task in text analysis and classification. It can be based, e.g., on Jaccard, cosine or other similar measures. Such measures consider the text as a bag-of-words and, therefore, lose some syntactic and semantic features of its sentences. This article presents a different measure based on the so-called artificial sentence pattern (ASP) method. This method has been developed to analyze texts in the Polish language, which has very rich inflection. Therefore, ASP utilizes syntactic and semantic rules of the Polish language. Nevertheless, we argue that it admits extensions to other languages. As a result of the analysis, we have obtained several hypernodes which contain the most important words. Each hypernode corresponds to one of the examined documents, the latter being published papers from the agriculture domain written in Polish. Experimental results obtained from that set of papers are described and discussed. The results are visually illustrated using graphs of hypernodes and compared with the Jaccard and cosine measures.

Piotr Wrzeciono, Waldemar Karwowski
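
The two baseline measures named in the abstract above, Jaccard and cosine similarity over a bag-of-words, can be sketched in a few lines of Python. This is a minimal illustration of the baselines only, not of the ASP method; the function names, the whitespace tokenization and the English example sentences are assumptions made for the sketch.

```python
import math
from collections import Counter


def jaccard(a_tokens, b_tokens):
    """Jaccard similarity of the two token sets."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0  # convention chosen here for two empty texts
    return len(a & b) / len(a | b)


def cosine(a_tokens, b_tokens):
    """Cosine similarity of the two term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Two sentences with opposite meanings but identical bags of words.
s1 = "the dog bites the man".split()
s2 = "the man bites the dog".split()
print(jaccard(s1, s2), cosine(s1, s2))
```

Both measures rate the two sentences as identical because word order is discarded, which is the loss of syntactic and semantic information the abstract refers to.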
Pattern Recognition in Financial Data Using Association Rule

The paper is devoted to the study of patterns among the world’s financial markets. The classical Association Rules method was adopted to study the relations between time series of stock market indices. The comovement patterns were found to predominate over the anti-comovement ones. The strength of the relations depends on the distance between markets. Extracting the strongest patterns made it possible to distinguish groups of financial markets. The strongest links between the Polish and other stock markets were discovered.

Krzysztof Karpio, Piotr Łukasiewicz
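
As a rough illustration of how an association rule can express comovement between two index time series, the Python sketch below computes the support and confidence of the rule "index A rises, therefore index B rises" over a set of trading days. The encoding of days as rise/no-rise events, the function name `rule_metrics` and the return values are hypothetical; the paper's own item definitions and data are not given in the abstract.

```python
def rule_metrics(antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Both arguments are equal-length lists of booleans, one per trading day.
    """
    n = len(antecedent)
    both = sum(1 for a, c in zip(antecedent, consequent) if a and c)
    ante = sum(1 for a in antecedent if a)
    support = both / n                      # share of days where both events occur
    confidence = both / ante if ante else 0.0  # share of antecedent days with consequent
    return support, confidence


# Hypothetical daily returns (in percent) of two indices.
returns_a = [0.5, -0.2, 0.3, 0.1, -0.4, 0.2, -0.1, 0.6]
returns_b = [0.4, -0.1, 0.2, -0.3, -0.2, 0.1, 0.2, 0.3]
up_a = [r > 0 for r in returns_a]
up_b = [r > 0 for r in returns_b]

support, confidence = rule_metrics(up_a, up_b)
print(support, confidence)
```

An anti-comovement rule can be evaluated with the same function by negating the consequent, for example `rule_metrics(up_a, [not u for u in up_b])`.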
Ulam Spiral and Prime-Rich Polynomials

The set of prime numbers visualized as the Ulam spiral was considered from the image processing perspective. Sequences of primes forming line segments were detected with a special version of the Hough transform. The polynomials which generate the numbers related to these sequences were investigated for their potential richness in prime numbers. One of the polynomials, which generates the numbers forming an 11-point sequence, was found to be exceptionally prime-rich, although it was not the longest sequence found. This polynomial is $$4n^2 - 1260n + 98827$$ and it generates 613 primes (20 of them with the minus sign) for the first 1000 non-negative integers as arguments. This is more than generated by some other well-known prime-rich polynomials, the Euler one included.

Arkadiusz Orłowski, Leszek J. Chmielewski
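
The prime counts quoted in the abstract above can be checked with a short Python sketch that evaluates the polynomial for the arguments 0 to 999 and tests the absolute value of each result for primality. The counting convention is an assumption: values are counted once per argument, so a prime reached by two different arguments is counted twice, and negative values count if their absolute value is prime. The helper names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for values of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def f(n):
    """The polynomial from the abstract: 4n^2 - 1260n + 98827."""
    return 4 * n * n - 1260 * n + 98827


values = [f(n) for n in range(1000)]
prime_count = sum(1 for v in values if is_prime(abs(v)))
negative_primes = sum(1 for v in values if v < 0 and is_prime(-v))
print(prime_count, negative_primes)
```

Since the polynomial can be rewritten as (2n - 315)^2 - 398, it is negative exactly for the arguments 148 to 167, which is where the primes "with the minus sign" come from.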