ABSTRACT
Poorly designed charts are prevalent in reports, magazines, books and on the Web. Most of these charts are only available as bitmap images; without access to the underlying data it is prohibitively difficult for viewers to create more effective visual representations. In response we present ReVision, a system that automatically redesigns visualizations to improve graphical perception. Given a bitmap image of a chart as input, ReVision applies computer vision and machine learning techniques to identify the chart type (e.g., pie chart, bar chart, or scatterplot). It then extracts the graphical marks and infers the underlying data. Using a corpus of images drawn from the Web, ReVision achieves image classification accuracy of 96% across ten chart categories. It also accurately extracts marks from 79% of bar charts and 62% of pie charts, and from these charts it successfully extracts data from 71% of bar charts and 64% of pie charts. ReVision then applies perceptually based design principles to populate an interactive gallery of redesigned charts. With this interface, users can view alternative chart designs and retarget content to different visual styles.
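To make the step from extracted marks to inferred data concrete, the following is a minimal sketch and not ReVision's actual implementation. It assumes the marks have already been extracted as simple geometric measurements: angular extents for pie slices, and pixel y-coordinates for bar tops together with two labeled axis ticks. The function names and parameters are hypothetical and chosen for illustration only.

```python
from typing import List


def pie_angles_to_percentages(angles_deg: List[float]) -> List[float]:
    """Convert pie slice angular extents (degrees) into percentages of the whole.

    Assumption: each slice's angle is proportional to the value it encodes.
    """
    total = sum(angles_deg)
    if total <= 0:
        raise ValueError("slice angles must sum to a positive value")
    return [100.0 * a / total for a in angles_deg]


def bar_tops_to_values(
    bar_top_ys: List[float],
    tick_y0: float,
    tick_value0: float,
    tick_y1: float,
    tick_value1: float,
) -> List[float]:
    """Map bar-top pixel rows to data values with a linear axis model.

    Assumption: the y-axis is linear, and two tick labels have been read
    (for example by OCR), giving pixel row tick_y0 for tick_value0 and
    pixel row tick_y1 for tick_value1.
    """
    if tick_y0 == tick_y1:
        raise ValueError("the two reference ticks must lie on different pixel rows")
    span_px = tick_y1 - tick_y0
    span_value = tick_value1 - tick_value0
    return [tick_value0 + (y - tick_y0) * span_value / span_px for y in bar_top_ys]


if __name__ == "__main__":
    # Three slices spanning 180, 90 and 90 degrees.
    print(pie_angles_to_percentages([180.0, 90.0, 90.0]))
    # Axis ticks: pixel row 400 is labeled 0, pixel row 200 is labeled 10.
    print(bar_tops_to_values([300.0, 200.0], 400.0, 0.0, 200.0, 10.0))
```

Image coordinates usually grow downward, so the value associated with the higher pixel row is typically the smaller one; the linear mapping above handles either orientation because it depends only on the two reference ticks.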
Index Terms
- ReVision: automated classification, analysis and redesign of chart images