2014 | Book

Graphics Recognition. Current Trends and Challenges

10th International Workshop, GREC 2013, Bethlehem, PA, USA, August 20-21, 2013, Revised Selected Papers

edited by: Bart Lamiroy, Jean-Marc Ogier

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 10th International Workshop on Graphics Recognition, GREC 2013, held in Bethlehem, PA, USA, in August 2013.

The 20 revised full papers presented were carefully reviewed and selected from 32 initial submissions. Graphics recognition is a subfield of document image analysis that deals with graphical entities in engineering drawings, sketches, maps, architectural plans, musical scores, mathematical notation, tables, and diagrams. Accordingly the conference papers are organized in 5 topical sessions on symbol spotting and retrieval, graphics recognition in context, structural and perceptual based approaches, low level processing, and performance evaluation and ground truthing.

Table of contents

Frontmatter

Symbol Spotting and Retrieval

Frontmatter
Spotting Graphical Symbols in Camera-Acquired Documents in Real Time
Abstract
In this paper we present a system for spotting graphical symbols in camera-acquired document images. The system is based on the extraction and subsequent matching of compact ORB local features computed over interest key-points. The FLANN indexing framework, based on approximate nearest neighbor search, then allows local descriptors to be matched efficiently between the captured scene and the graphical models. Finally, the RANSAC algorithm is used to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
Marçal Rusiñol, Dimosthenis Karatzas, Josep Lladós
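The descriptor-matching step described in this abstract can be illustrated with a minimal sketch. ORB descriptors are binary strings, so they are compared with the Hamming distance. The sketch below is not the authors' system: it swaps the FLANN approximate search for an exhaustive nearest-neighbor search, leaves out the RANSAC homography step, and uses 8-bit toy descriptors. The names `hamming`, `match_descriptors` and `max_distance` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(model, scene, max_distance=2):
    """Match each model descriptor to its nearest scene descriptor.

    Exhaustive search is used here for clarity; it stands in for the
    approximate nearest-neighbor index mentioned in the abstract.
    The distance cut-off (max_distance) is an assumption of this sketch.
    Returns a list of (model_index, scene_index, distance) tuples.
    """
    if not scene:
        return []
    matches = []
    for i, descriptor in enumerate(model):
        # Find the scene descriptor with the smallest Hamming distance.
        j, dist = min(
            ((j, hamming(descriptor, candidate)) for j, candidate in enumerate(scene)),
            key=lambda pair: pair[1],
        )
        if dist <= max_distance:
            matches.append((i, j, dist))
    return matches


# Toy 8-bit descriptors (real ORB descriptors are 256 bits long).
model_descriptors = [0b10110010, 0b00001111]
scene_descriptors = [0b00001110, 0b11110000, 0b10110011]
print(match_descriptors(model_descriptors, scene_descriptors))
```

In a full pipeline, the matched key-point coordinates would then be passed to a robust estimator such as RANSAC to recover the homography.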
A Product Graph Based Method for Dual Subgraph Matching Applied to Symbol Spotting
Abstract
Product graphs have been shown to be a way of matching subgraphs. This paper reports the extension of the product graph methodology for subgraph matching applied to symbol spotting in graphical documents. Here we focus on the two major limitations of the previous version of the algorithm: (1) spurious nodes and edges in the graph representation and (2) inefficient node and edge attributes. To deal with the noisy information of vectorized graphical documents, we consider a dual edge graph representation of the original graph representing the graphical information, and the product graph is computed between the dual edge graphs of the pattern graph and the target graph. The dual edge graph with redundant edges is helpful for an efficient and tolerant encoding of the structural information of the graphical documents. The adjacency matrix of the product graph locates the pairs of similar edges of the two operand graphs, and exponentiating the adjacency matrix finds similar random walks of greater lengths. Nodes joining similar random walks between two graphs are found by combining different weighted exponentials of adjacency matrices. An experimental investigation reveals that the recall obtained by this approach is quite encouraging.
Anjan Dutta, Josep Lladós, Horst Bunke, Umapada Pal
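The role of the product graph's adjacency matrix and its powers can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' algorithm: it uses unattributed node graphs instead of dual edge graphs, takes the plain tensor (Kronecker) product, and weights walks of length k by a hypothetical factor `decay**k`. All function names are made up for the example.

```python
def tensor_product(a, b):
    """Adjacency matrix of the tensor product of two graphs.

    Product node (i, k) is stored at index i * len(b) + k, and it is linked
    to (j, l) exactly when i-j is an edge of a and k-l is an edge of b.
    """
    n, m = len(a), len(b)
    p = [[0] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(n):
            for k in range(m):
                for l in range(m):
                    p[i * m + k][j * m + l] = a[i][j] * b[k][l]
    return p


def mat_mul(x, y):
    """Product of two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_scores(a, b, max_len=3, decay=0.5):
    """Score each product node by a weighted count of common walks.

    Entry (u, v) of the k-th power of the product adjacency matrix counts
    walks of length k that exist simultaneously in both graphs, so the row
    sums give the number of common walks starting at each node pair.
    """
    p = tensor_product(a, b)
    scores = [0.0] * len(p)
    power = [row[:] for row in p]
    weight = decay
    for _ in range(max_len):
        for idx, row in enumerate(power):
            scores[idx] += weight * sum(row)
        power = mat_mul(power, p)
        weight *= decay
    return scores


edge = [[0, 1], [1, 0]]                       # pattern: a single edge
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # target: a triangle
print(walk_scores(edge, triangle, max_len=2))
```

Node pairs with high scores take part in many common walks and are therefore candidates for a match between pattern and target.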
Hierarchical Plausibility-Graphs for Symbol Spotting in Graphical Documents
Abstract
Graph representations of graphical documents often suffer from noise such as spurious nodes and edges, and from discontinuities. In general these errors occur during low-level image processing, viz. binarization, skeletonization, vectorization, etc. A hierarchical graph representation is a convenient and efficient way to solve this kind of problem by hierarchically merging node-node and node-edge pairs depending on the distance. But the creation of a hierarchical graph representing the graphical information often uses hard thresholds on the distance to create the hierarchical nodes (next state) of the lower nodes (or states) of a graph. As a result, the representation often loses useful information. This paper introduces plausibilities to the nodes of the hierarchical graph as a function of distance and proposes a modified algorithm for matching subgraphs of the hierarchical graphs. The plausibility-annotated nodes help to improve the performance of the matching algorithm on two hierarchical structures. To show the potential of this approach, we conduct an experiment with the SESYD dataset.
Klaus Broelemann, Anjan Dutta, Xiaoyi Jiang, Josep Lladós
Towards Searchable Line Drawings, a Content-Based Symbol Retrieval Approach with Variable Query Complexity
Abstract
Current symbol spotting and retrieval methods are not yet able to achieve the goal of both high accuracy and efficiency on large databases of line drawings. This paper presents an approach for focused symbol retrieval as a step towards achieving such a goal by using concepts from image retrieval. During the off-line learning phase of the proposed approach, regions of interest are extracted from the drawings based on feature grouping. The regions are then described using an off-the-shelf descriptor. Similar descriptors are clustered, and finally a visual symbol vocabulary is learned by an SVM classifier. The vocabulary is constructed assuming no knowledge of the contents of the drawings. During on-line retrieval, the classifier recognizes the descriptors of query regions. A query can be a partial or a complete symbol, and can contain contextual noise around a symbol or more than one symbol. Experimental results are presented for a database of architectural floor plans.
Nibal Nayef, Wonmin Byeon, Thomas M. Breuel

Graphics Recognition in Context

Frontmatter
Adaptive Contour Classification of Comics Speech Balloons
Abstract
Comic book digitization combined with subsequent comic book understanding gives rise to a variety of new applications, including content reflowing, mobile reading and multi-modal search. Document understanding in this domain is challenging as comics are semi-structured documents, with semantic information shared between the graphical and textual parts. Speech balloon contour analysis reveals the speech tone, which is an essential step towards fully automatic comics understanding. In this paper we present the first approach for classifying speech balloons in scanned comic books, where we separate and analyze their contour variations to classify them as “smooth” (normal speech), “wavy” (thought) or “zigzag” (exclamation). The experiments show a global classification accuracy of 85.2 % on a wide variety of balloons from the eBDtheque dataset.
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie, Jean-Marc Ogier
Modified Weighted Direction Index Histogram Method for Schema Recognition
Abstract
Recently, many clinical documents have been computerized because of the spread of Hospital Information Systems (HIS). On the other hand, a large number of paper-based documents are not used effectively and are still archived on paper in hospitals. The authors have proposed document image recognition methods for medical/clinical document retrieval. We have also discussed a recognition method for schema (medical line drawing) images in such documents, because these hold key information for document retrieval. However, annotations added to a schema change the feature vector drastically, and as a result the recognition accuracy is reduced. This paper discusses a schema recognition method that takes annotations into account. Actual schema images used in a hospital were employed as experimental materials. We confirmed that the recognition accuracy of the proposed method improved to 98.52 %.
Hiroshi Kajiwara, Hiroharu Kawanaka, Koji Yamamoto, Haruhiko Takase, Shinji Tsuruoka

Structural and Perceptual Based Approaches, Grouping

Frontmatter
An Algorithm for Grouping Lines Which Converge to Vanishing Points in Perspective Sketches of Polyhedra
Abstract
We seek to detect the vanishing points implied by design sketches of engineering products. Adapting previous approaches, developed in computer vision for analysis of vectorised photographic images, is unsatisfactory, as they do not allow for the inherent imperfection of sketches. Human perception seems not to be disturbed by such imperfections. Hence, we have designed and implemented a vanishing point detection algorithm which mimics the human perception process and tested it with perspective line drawings derived from engineering sketches of polyhedral objects. The new algorithm is fast, easily-implemented, returns the approximate locations of the main vanishing points and identifies those groups of lines in 2D which correspond to groups of parallel edges in the 3D object.
Pedro Company, Peter A. C. Varley, Raquel Plumed
Visual Saliency and Terminology Extraction for Document Classification
Abstract
Document digitization has become a crucial economic issue in our society, and it has become necessary to be able to organize the resulting huge number of documents. This paper proposes a new method to automatically classify documents using a saliency-based segmentation process on the one hand, and terminology extraction and annotation on the other. The saliency-based segmentation is used to extract salient regions and, in the process, logos, while the terminology approach is used to annotate them and to automatically classify the document. The approach does not require human expertise, and uses Google Images as a knowledge database. The results obtained on a real database of 1766 documents show the relevance of the approach.
Duthil Benjamin, Coustaty Mickael, Courboulay Vincent, Jean-Marc Ogier
Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies
Abstract
In this paper we present a wall segmentation approach for floor plans that is able to work independently of the graphical notation, does not need any pre-annotated data for learning, and is able to segment multiple-shaped walls such as beams and curved walls. This method results from the combination of the wall segmentation approaches [3, 5] presented recently by the authors. Firstly, potential straight wall segments are extracted in an unsupervised way similar to [3], but restricting even further the wall candidates considered in the original approach. Then, based on [5], these segments are used to learn the texture pattern of walls and spot the lost instances. The presented combination of both methods has been tested on 4 available datasets with different notations and compared qualitatively and quantitatively to the state of the art applied to these collections. Additionally, some qualitative results on floor plans directly downloaded from the Internet are reported in the paper. The overall performance of the method demonstrates its adaptability both to different wall notations and shapes, and to different document qualities and resolutions.
Lluís-Pere de las Heras, Ernest Valveny, Gemma Sánchez
Detecting Recurring Deformable Objects: An Approximate Graph Matching Method for Detecting Characters in Comics Books
Abstract
Graphs are popular data structures used to model pairwise relations between elements from a given collection. In image processing, adjacency graphs are often used to represent the relations between segmented regions. The comparison of such graphs has been widely studied, but graph matching strategies are essential to find similar patterns efficiently. In this paper, we propose a method to detect the recurring characters in comic books. Note that in this paper the term “character” means a protagonist of the story. In our approach, each panel is represented with an attributed adjacency graph. Then, an inexact graph matching strategy is applied to find recurring structures among this set of graphs. The main idea is that the same character will be represented by similar subgraphs in the different panels where it appears. The two-step matching process consists of a node matching step and an edge validation step. Experiments show that our approach is able to detect recurring structures in the graphs and consequently the recurrent characters in a comic book. The originality of our approach is that no prior object model is required for the characters. The algorithm automatically detects all recurring structures corresponding to the main characters of the story.
Hoang Nam Ho, Christophe Rigaud, Jean-Christophe Burie, Jean-Marc Ogier
Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans
Abstract
This paper proposes a runlength histogram signature as a perceptual descriptor of architectural plans in a retrieval scenario. The style of an architectural drawing is characterized by the perception of lines, shapes and texture. Such visual stimuli are the basis for defining semantic concepts such as space properties, symmetry, density, etc. We propose runlength histograms extracted in vertical, horizontal and diagonal directions as a characterization of line and space properties in floor plans, which can be roughly associated with a description of walls and room structure. A retrieval application illustrates the performance of the proposed approach, where given a plan as a query, similar ones are obtained from a database. A ground truth based on human observation has been constructed to validate the hypothesis. Additional retrieval results on sketched building facades are reported qualitatively in this paper. The descriptor's good descriptive power and its adaptability to two different kinds of sketch drawings, despite its simplicity, show the interest of the proposed approach and open a challenging research line in graphics recognition.
Lluís-Pere de las Heras, David Fernández, Alicia Fornés, Ernest Valveny, Gemma Sánchez, Josep Lladós
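A runlength histogram of the kind named in this abstract can be sketched in a few lines. The code below is a simplified illustration and not the authors' descriptor: it handles only the horizontal and vertical directions (diagonals are omitted), works on a binary image given as nested lists, and clips long runs into the last bin. The function names and the `max_run` parameter are hypothetical.

```python
def run_lengths(line, value):
    """Lengths of maximal runs of `value` along one row or column."""
    runs, count = [], 0
    for pixel in line:
        if pixel == value:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def runlength_histogram(image, value=1, direction="horizontal", max_run=4):
    """Histogram of run lengths of `value` pixels in a binary image.

    Bin r - 1 counts runs of length r; runs longer than max_run are
    accumulated in the last bin (an assumption of this sketch).
    """
    lines = image if direction == "horizontal" else list(zip(*image))
    hist = [0] * max_run
    for line in lines:
        for run in run_lengths(line, value):
            hist[min(run, max_run) - 1] += 1
    return hist


plan = [
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
# Concatenating the directional histograms yields a simple image signature.
signature = (runlength_histogram(plan, direction="horizontal")
             + runlength_histogram(plan, direction="vertical"))
print(signature)
```

Calling the function with `value=0` would give the corresponding histograms for background runs, which relate to the space between lines rather than to the lines themselves.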

Low Level Processing

Frontmatter
A Stitching Method for Large Document Images
Abstract
In this paper, we are interested in stitching specific types of images such as schemes, cartographies, documents or drawings that have been acquired using a scanner. Because of the size of these documents, it is not possible to capture them in a single acquisition, even using large scanners. The result of the acquisition is then an image mosaic that needs to be stitched to obtain the entire image. For that purpose, we propose an adaptation of feature-based methods, which are not directly usable with the images we want to process. Indeed, extraction of points of interest (POIs) on the entire image requires too much memory, and matches are not always pertinent because of the particular nature of these documents. To demonstrate the good performance of our proposal, we present quantitative and qualitative results obtained using two datasets: a set of images divided synthetically and a set of images that have been acquired manually using a scanner.
Ludovic Paulhac, Jean-Philippe Domenger
Filtering Out Readers’ Underline in Monochromatic and Color Documents
Abstract
Text “underlining” is a practice of many interested readers, but it may be seen as a noise inserted by the user that damages the physical integrity of a document. This paper presents two different algorithms for underline removal. The first one addresses the case of monochromatic document images. The second algorithm is applied to remove the underline noise in recent and aged documents written on a blank sheet of white paper. Underline information is also used to automatically generate summaries of documents.
Ricardo da Silva Barboza, Rafael Dueire Lins, Luiz W. Nagata Balduino
Binarization with the Local Otsu Filter: Integral Histograms for Document Image Analysis
Abstract
In this paper we introduce the use of integral histograms (IH) for document analysis. IH take advantage of the great increase over time in the memory size available on computers. By storing selected histogram features at each pixel position, several image filters can be calculated with constant complexity. In other words, time complexity is remarkably reduced by using more memory. While IH have received much attention in the computer vision field, they have not been intensively investigated for document analysis so far. As a first step in this direction, we analyze IH for the toy problem of image binarization, which is a prerequisite for many graphics and text recognition systems. The results of our participation in the HDIBCO2010 competition as well as our experiments with all DIBCO datasets show the capabilities of this novel method for document image analysis.
Anguelos Nicolaou, Rolf Ingold, Marcus Liwicki
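The memory-for-time trade described in this abstract can be illustrated with a minimal integral histogram. This is a sketch under stated assumptions rather than the authors' implementation: grey levels are quantized into a small number of bins, the image is a nested list, and the Otsu step is the textbook between-class-variance criterion applied to a window histogram. All names are hypothetical.

```python
def integral_histogram(image, bins=4, levels=256):
    """Cumulative histogram per pixel position.

    ih[y][x] holds the histogram of the rectangle image[0:y][0:x], so the
    table has one extra row and column of zeros.
    """
    height, width = len(image), len(image[0])
    ih = [[[0] * bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = ih[y + 1][x + 1]
            up, left, diag = ih[y][x + 1], ih[y + 1][x], ih[y][x]
            for b in range(bins):
                cell[b] = up[b] + left[b] - diag[b]
            cell[image[y][x] * bins // levels] += 1
    return ih


def window_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1.

    Needs four table lookups per bin, whatever the window size.
    """
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(len(ih[0][0]))]


def otsu_threshold(hist):
    """Bin index t maximizing between-class variance (class 0: bins <= t)."""
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_bin, best_var = t, var
    return best_bin


image = [[10, 200, 10, 200],
         [10, 200, 250, 250]]
ih = integral_histogram(image)
local_hist = window_histogram(ih, 0, 0, 2, 2)
print(local_hist, otsu_threshold(local_hist))
```

A local Otsu filter would evaluate `window_histogram` and `otsu_threshold` for a window centred on every pixel; the cost per pixel then depends on the number of bins and not on the window area.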
Improved Contour-Based Corner Detection for Architectural Floor Plans
Abstract
A new rotation-invariant corner detection method for architectural line drawing images is proposed in this paper. The proposed method is capable of finding corners of objects in line drawing images by filtering out unnecessary points without changing the overall structure. In particular, in the case of diagonal lines and corners, our method is capable of removing repetitive points. The proposed method is applied to corner detection of walls in floor plans, which in turn are used for detection of wall edges. To evaluate the effectiveness of the detected corners, gap closing and wall edge detection are performed on a publicly available dataset of 90 floor plans, where we achieved a recognition and detection accuracy of 95 %.
Max Feltes, Sheraz Ahmed, Andreas Dengel, Marcus Liwicki

Performance Evaluation and Ground Truthing

Frontmatter
The ICDAR/GREC 2013 Music Scores Competition: Staff Removal
Abstract
The first competition on music scores, organized at ICDAR and GREC in 2011, awoke the interest of researchers, who participated in both the staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real-case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide a fuller description of the dataset, degradation models, evaluation metrics, the participants’ methods and the obtained results, which could not be presented in the ICDAR and GREC proceedings due to page limitations.
Alicia Fornés, Van Cuong Kieu, Muriel Visani, Nicholas Journet, Anjan Dutta
Interpretation, Evaluation and the Semantic Gap ... What if We Were on a Side-Track?
Abstract
A significant amount of research in Document Image Analysis, and Machine Perception in general, relies on the extraction and analysis of signal cues with the goal of interpreting them into higher level information. This paper gives an overview of how this interpretation process is usually considered, and how the research communities proceed in evaluating existing approaches and methods developed for realizing these processes. Evaluation being an essential part of measuring the quality of research and assessing the progress of the state of the art, our work aims at showing that classical evaluation methods are not necessarily well suited to interpretation problems, or, at least, that they introduce a strong bias, not necessarily visible at first sight, and that new ways of comparing methods and measuring performance are necessary. It also shows that the infamous Semantic Gap seems to be an inherent and unavoidable part of the general interpretation process, especially when considered within the framework of traditional evaluation. The use of Formal Concept Analysis is put forward to leverage these limitations into a new tool for the analysis and comparison of interpretation contexts.
Bart Lamiroy
Final Report of GREC’13 Arc and Line Segmentation Contest
Abstract
Recognition of geometric primitives such as lines and arcs helps in the automatic conversion of line drawing document images into electronic form. A large number of raster-to-vector methods can be found in the literature. A line and arc segmentation contest was held in conjunction with the tenth IAPR International Workshop on Graphics Recognition (GREC 2013) to compare the performance of different methods on a uniform platform. The contest was broken down into two challenges: arc segmentation and line segmentation. The dataset includes engineering drawings (for the arc segmentation challenge) and cadastral maps (for the line segmentation challenge). Jianping Wu’s method obtained the highest score (0.541) and hence won the Arc Segmentation Contest. Liu Wenyin’s method, the only method that participated in the line segmentation contest, achieved 66 % segmentation accuracy.
Syed Saqib Bukhari, Hasan S. M. Al-Khaffaf, Faisal Shafait, Mohd Azam Osman, Abdullah Zawawi Talib, Thomas M. Breuel
Datasets for the Evaluation of Substitution-Tolerant Subgraph Isomorphism
Abstract
Due to their representative power, structural descriptions have gained great interest in the community working on graphics recognition. Indeed, graph-based representations have successfully been used for isolated symbol recognition. New challenges in this research field have focused on symbol recognition, symbol spotting or symbol-based indexing of technical drawings.
When they are based on structural descriptions, these tasks can be expressed by means of a subgraph isomorphism search, which consists in locating the instance of a pattern graph representing a symbol in a target graph representing the whole document image. However, there is a lack of publicly available datasets for evaluating the performance of subgraph isomorphism approaches in the presence of noisy data.
In this paper, we present five datasets that can be used to evaluate the performance of algorithms on several tasks involving subgraph isomorphism. Four of these datasets have been synthetically generated and make it possible to evaluate the search for a single instance of the pattern with or without perturbed labels. The fifth dataset corresponds to the structural description of architectural plans and makes it possible to evaluate the search for multiple occurrences of the pattern. These datasets are made available for download. We also propose several measures to qualify each of the tasks.
Pierre Héroux, Pierre Le Bodic, Sébastien Adam
Evaluation of Diagrams Produced by Text-to-Graphic Conversion Systems
Abstract
Text that essentially describes a graphic (or diagram) appears in many branches of science and engineering. Researchers have attempted to have machines draw the underlying graphics after automatically understanding the text, but automatic evaluation of the accuracy of such drawings has remained unexplored. This paper aims at measuring the accuracy of a graphic that results from a text-to-graphic conversion process. Experiments show that this evaluation problem poses several challenges which have not been addressed before and hence calls for new initiatives. School-level geometry problems have been taken as a reference to demonstrate the underlying challenges and related issues.
Anirban Mukherjee, Utpal Garain, Arindam Biswas
Backmatter
Metadata
Title
Graphics Recognition. Current Trends and Challenges
edited by
Bart Lamiroy
Jean-Marc Ogier
Copyright year
2014
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-44854-0
Print ISBN
978-3-662-44853-3
DOI
https://doi.org/10.1007/978-3-662-44854-0