International Journal of Multimedia Information Retrieval

1 June 2015 | Regular Paper

Distributed cross-media multiple binary subspace learning

Authors: Xueyi Zhao, Chenyi Zhang, Zhongfei Zhang

Published in: International Journal of Multimedia Information Retrieval | Issue 2/2015

Abstract

Large-scale data are ubiquitous in today’s real-world applications, including learning on cross-media data. We propose a semi-supervised learning method, named Multiple Binary Subspace Regression, for concept detection on cross-media data. To mine the features common to data with multiple modalities, we simultaneously project the original cross-media data onto a shared subspace-level representation by mapping each modality to its corresponding subspace for dimensionality reduction. All subspaces are constrained to be binary, so the subsequent computation involves only additions and no multiplications. The dimensionality reduction to a binary subspace and the concept detection on this subspace are optimized jointly, which leads to a semi-supervised model. To handle large-scale data, the learning method can readily be implemented on a MapReduce-based Hadoop system. Empirical studies demonstrate its competitive convergence, efficiency, and scalability in comparison with methods from the state-of-the-art literature.
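
The addition-only property claimed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the binary subspace is stored as 0/1 basis vectors (the abstract does not state the encoding), and the names `project_binary` and `project_dense` are hypothetical helpers introduced here. Under that assumption, each projected coordinate is the sum of the input entries selected by a basis vector, so no multiplication is needed.

```python
def project_binary(x, basis):
    """Project x onto binary basis vectors using additions only.

    Assumption: `basis` is a list of k vectors with 0/1 entries,
    each the same length as x. The encoding is illustrative only.
    """
    z = []
    for column in basis:
        total = 0.0
        for value, bit in zip(x, column):
            if bit:  # a 1 entry adds x_i; a 0 entry contributes nothing
                total += value
        z.append(total)
    return z


def project_dense(x, basis):
    """Reference computation: the usual multiply-and-add inner product."""
    return [sum(v * b for v, b in zip(x, column)) for column in basis]


x = [0.5, -1.0, 2.0, 3.5]
basis = [[1, 0, 1, 0],
         [0, 1, 1, 1]]

z = project_binary(x, basis)
print(z)                            # [2.5, 4.5]
print(z == project_dense(x, basis)) # True
```

Both functions return the same low-dimensional representation; the binary version simply skips the products that a real-valued basis would require.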

Title: Distributed cross-media multiple binary subspace learning
Authors: Xueyi Zhao
Chenyi Zhang
Zhongfei Zhang
Publication date: 1 June 2015
Publisher: Springer London
Published in: International Journal of Multimedia Information Retrieval / Issue 2/2015
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI: https://doi.org/10.1007/s13735-015-0081-4
