research-article

How do humans sketch objects?

Published: 01 July 2012

Abstract

Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available to all humans. This paper is the first large-scale exploration of human sketches. We analyze the distribution of non-expert sketches of everyday objects such as 'teapot' or 'car'. We ask humans to sketch objects of a given category and gather 20,000 unique sketches evenly distributed over 250 object categories. With this dataset we perform a perceptual study and find that humans can correctly identify the object category of a sketch 73% of the time. We compare human performance against computational recognition methods. We develop a bag-of-features sketch representation and use multi-class support vector machines, trained on our sketch dataset, to classify sketches. The resulting recognition method is able to identify unknown sketches with 56% accuracy (chance is 0.4%). Based on the computational model, we demonstrate an interactive sketch recognition system. We release the complete crowd-sourced dataset of sketches to the community.
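
The figures in the abstract can be checked with a little arithmetic, and the bag-of-features idea can be illustrated in a few lines. The following is a minimal sketch, not the authors' implementation: the 2-D descriptors, the three-word vocabulary, and all function names are hypothetical choices made for illustration, and the classification step (multi-class support vector machines) is omitted entirely. It only shows how a set of local descriptors might be quantized against a visual vocabulary and summarized as a normalized histogram.

```python
NUM_CATEGORIES = 250    # from the abstract
NUM_SKETCHES = 20_000   # from the abstract


def chance_accuracy(num_categories):
    """Accuracy of uniform random guessing over equally likely categories."""
    return 1.0 / num_categories


def sketches_per_category(num_sketches, num_categories):
    """Sketches per category when the dataset is evenly distributed."""
    return num_sketches // num_categories


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(vocabulary[i]))


def bag_of_features(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one sketch."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Hypothetical toy data: a 3-word vocabulary and 4 local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
histogram = bag_of_features(descriptors, vocabulary)

print(chance_accuracy(NUM_CATEGORIES))                      # 1 / 250
print(sketches_per_category(NUM_SKETCHES, NUM_CATEGORIES))  # 20,000 / 250
print(histogram)
```

With 250 equally likely categories, random guessing succeeds once in 250 tries, which matches the 0.4% chance level quoted in the abstract, and an even split of 20,000 sketches gives 80 per category. In a full pipeline, histograms like the one above would serve as the feature vectors passed to a classifier.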


Supplemental Material

tp117_12.mp4 (MP4, 37.8 MB)



      • Published in

        ACM Transactions on Graphics, Volume 31, Issue 4
        July 2012
        935 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/2185520

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
