research-article

ReVision: automated classification, analysis and redesign of chart images

Authors:
Manolis Savva, Stanford University, Palo Alto, CA, USA
Nicholas Kong, University of California, Berkeley, Berkeley, CA, USA
Arti Chhajta, Stanford University, Palo Alto, CA, USA
Li Fei-Fei, Stanford University, Palo Alto, CA, USA
Maneesh Agrawala, University of California, Berkeley, Berkeley, CA, USA
Jeffrey Heer, Stanford University, Palo Alto, CA, USA

UIST '11: Proceedings of the 24th annual ACM symposium on User interface software and technology, October 2011, Pages 393–402
https://doi.org/10.1145/2047196.2047247

Published: 16 October 2011

ABSTRACT

Poorly designed charts are prevalent in reports, magazines, books and on the Web. Most of these charts are only available as bitmap images; without access to the underlying data it is prohibitively difficult for viewers to create more effective visual representations. In response, we present ReVision, a system that automatically redesigns visualizations to improve graphical perception. Given a bitmap image of a chart as input, ReVision applies computer vision and machine learning techniques to identify the chart type (e.g., pie chart, bar chart, scatterplot). It then extracts the graphical marks and infers the underlying data. Using a corpus of images drawn from the Web, ReVision achieves image classification accuracy of 96% across ten chart categories. It also accurately extracts marks from 79% of bar charts and 62% of pie charts, and from these charts it successfully extracts data from 71% of bar charts and 64% of pie charts. ReVision then applies perceptually based design principles to populate an interactive gallery of redesigned charts. With this interface, users can view alternative chart designs and retarget content to different visual styles.

Index Terms

ReVision: automated classification, analysis and redesign of chart images
1. Human-centered computing
  1. Human computer interaction (HCI)

Published in
UIST '11: Proceedings of the 24th annual ACM symposium on User interface software and technology
October 2011
654 pages
ISBN: 9781450307161
DOI: 10.1145/2047196
General Chair: Jeff Pierce, IBM Research, USA
Program Chairs: Maneesh Agrawala, University of California, Berkeley, USA; Scott Klemmer, Stanford University, USA
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
chart understanding
computer vision
information extraction
redesign
visualization
Qualifiers
- research-article
Acceptance Rates
UIST '11 Paper Acceptance Rate: 67 of 262 submissions, 26%
Overall Acceptance Rate: 842 of 3,967 submissions, 21%
Article Metrics
- Total Citations: 174
- Total Downloads: 1,483
- Downloads (Last 12 months): 165
- Downloads (Last 6 weeks): 21