Neural Computing and Applications

1 February 2016 | Original Article

Joint learning of cross-modal classifier and factor analysis for multimedia data classification

Authors: Kanghong Duan, Hongxin Zhang, Jim Jing-Yan Wang

Published in: Neural Computing and Applications | Issue 2/2016

Abstract

In this paper, we study the problem of learning from multi-modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. We propose to represent the data of the two modalities by projecting them into a shared data space using the cross-modal factor analysis formulation, and to classify them in the shared space using a linear class label predictor, named the cross-modal classifier. The parameters of both the cross-modal classifier and cross-modal factor analysis are learned jointly, so that they regularize the learning of each other. We construct a unified objective function for this learning problem. With this objective function, we minimize both the distance between the projections of the image and the text of the same document and the classification error of the projections, measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multi-modal document data sets show the advantage of the proposed algorithm over state-of-the-art multimedia data classification methods.
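The abstract does not give the objective function itself, so the following is only a minimal sketch of the kind of joint objective it describes: a projection-mismatch term plus a hinge-loss term in the shared space. Every name here (`joint_objective`, `U`, `V`, `w`, `b`, the trade-off weight `C`) and the binary {-1, +1} labels are assumptions for illustration, not the authors' notation. Any constraints or regularizers the paper may place on the projections are omitted.

```python
import numpy as np


def joint_objective(X_img, X_txt, y, U, V, w, b=0.0, C=1.0):
    """Hypothetical joint objective: projection mismatch plus hinge loss.

    X_img: (n, d_img) image features; X_txt: (n, d_txt) text features.
    y: (n,) labels in {-1, +1} (binary labels are an assumption).
    U, V: assumed projection matrices into a shared k-dimensional space.
    w, b: assumed linear classifier in the shared space.
    C: assumed trade-off weight between the two terms.
    """
    Z_img = X_img @ U  # image projections, shape (n, k)
    Z_txt = X_txt @ V  # text projections, shape (n, k)

    # Squared distance between the two projections of each document.
    mismatch = np.sum((Z_img - Z_txt) ** 2)

    # Hinge loss of the linear predictor on both sets of projections.
    hinge_img = np.maximum(0.0, 1.0 - y * (Z_img @ w + b))
    hinge_txt = np.maximum(0.0, 1.0 - y * (Z_txt @ w + b))

    return float(mismatch + C * (hinge_img.sum() + hinge_txt.sum()))


if __name__ == "__main__":
    X = np.eye(2)                    # two toy documents, identical modalities
    y = np.array([1.0, -1.0])
    U = V = np.eye(2)                # identical projections: zero mismatch
    w = np.array([2.0, -2.0])        # separates both documents with margin
    print(joint_objective(X, X, y, U, V, w))
```

An alternating scheme in the spirit of the abstract would hold `w` fixed while updating `U` and `V`, then hold the projections fixed while updating `w`, and repeat. The update rules themselves are not stated in this abstract and are left out here.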

Title: Joint learning of cross-modal classifier and factor analysis for multimedia data classification
Authors: Kanghong Duan, Hongxin Zhang, Jim Jing-Yan Wang
Publication date: 1 February 2016
Publisher: Springer London
Published in: Neural Computing and Applications, Issue 2/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-015-1866-3
