Published in: Discover Computing 5-6/2014

1 October 2014 | Information Retrieval in the Intellectual Property Domain

Flowchart recognition for non-textual information retrieval in patent search

Authors: Marçal Rusiñol, Lluís-Pere de las Heras, Oriol Ramos Terrades

Abstract

Relatively little research has been done on patent image retrieval, and most existing approaches perform retrieval by means of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and the concepts they convey would be very helpful for patent professionals. In this paper we present a flowchart recognition method that produces a structured representation of flowchart images, which can then be queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task, and we report the results obtained on this dataset.

Metadata
Title
Flowchart recognition for non-textual information retrieval in patent search
Authors
Marçal Rusiñol
Lluís-Pere de las Heras
Oriol Ramos Terrades
Publication date
1 October 2014
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 5-6/2014
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-013-9234-3
