Published in: Mobile Networks and Applications 4/2015

1 August 2015

Multi-modal Similarity Retrieval with Distributed Key-value Store

Author: David Novak


Abstract

We propose a system architecture for large-scale similarity search in various types of digital data. The architecture combines contemporary, highly scalable distributed data stores with recent efficient similarity indexes and with other types of search indexes. The system supports several kinds of data access: distance-based similarity queries, standard term and attribute queries, and advanced queries that combine several search aspects (modalities). The first part of this work describes the generic architecture and the PPP-Codes similarity index, which is suitable for our system. In the second part, we describe two specific instances of this architecture that manage two large collections of digital images and provide content-based visual search, keyword search, attribute-based access, and their combinations. The first collection is the CoPhIR benchmark, with 106 million images accessed by MPEG-7 visual descriptors; the second collection contains 20 million images with complex features obtained from a deep convolutional neural network.
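To make the idea of a combined (multi-modal) query concrete, the following is a minimal sketch and not the system described in the paper. It assumes a similarity index that returns only candidate object IDs, a key-value store that maps IDs to full records, and a final step that filters by keyword and attribute and re-ranks by distance. The names `store`, `candidate_ids`, and `multimodal_search`, the toy data, and the Euclidean distance are all hypothetical; a plain dictionary stands in for the distributed store and a linear scan stands in for a real index such as PPP-Codes.

```python
import math

# Hypothetical stand-in for a distributed key-value store:
# object ID -> record with a feature vector, keywords, and an attribute.
store = {
    "img1": {"vec": (0.0, 0.0), "tags": {"beach", "sea"}, "year": 2009},
    "img2": {"vec": (1.0, 0.0), "tags": {"city"}, "year": 2012},
    "img3": {"vec": (0.1, 0.1), "tags": {"beach"}, "year": 2012},
    "img4": {"vec": (5.0, 5.0), "tags": {"beach"}, "year": 2012},
}


def candidate_ids(query_vec, limit):
    """Stand-in for a similarity index: return up to `limit` IDs, nearest first.

    This toy version scans everything; a real index would avoid the full scan
    and return an approximate candidate set.
    """
    ranked = sorted(store, key=lambda oid: math.dist(store[oid]["vec"], query_vec))
    return ranked[:limit]


def multimodal_search(query_vec, tag=None, min_year=None, k=2, candidates=3):
    """Combine similarity, keyword, and attribute conditions in one query."""
    hits = []
    for oid in candidate_ids(query_vec, candidates):
        rec = store[oid]  # key-value lookup by object ID
        if tag is not None and tag not in rec["tags"]:
            continue  # keyword condition not met
        if min_year is not None and rec["year"] < min_year:
            continue  # attribute condition not met
        hits.append((math.dist(rec["vec"], query_vec), oid))
    hits.sort()  # re-rank the surviving candidates by distance
    return [oid for _, oid in hits[:k]]


if __name__ == "__main__":
    print(multimodal_search((0.0, 0.0), tag="beach"))
    print(multimodal_search((0.0, 0.0), tag="beach", min_year=2012))
```

One consequence of this post-filtering design is that a strict keyword or attribute condition can leave fewer than `k` results, so the candidate set size is a tuning parameter in the sketch.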

Metadata
Title
Multi-modal Similarity Retrieval with Distributed Key-value Store
Author
David Novak
Publication date
1 August 2015
Publisher
Springer US
Published in
Mobile Networks and Applications / Issue 4/2015
Print ISSN: 1383-469X
Electronic ISSN: 1572-8153
DOI
https://doi.org/10.1007/s11036-014-0561-4
