Published in: World Wide Web 3/2019

03.05.2018

Multimodal deep representation learning for video classification

Authors: Haiman Tian, Yudong Tao, Samira Pouyanfar, Shu-Ching Chen, Mei-Ling Shyu

Abstract

Real-world applications usually encounter data of various modalities, each carrying valuable information. To enhance these applications, it is essential to analyze all of the information extracted from the different modalities, yet most existing learning models focus on a single modality and ignore the rest. This paper presents a new multimodal deep learning framework for event detection from videos that leverages recent advances in deep neural networks. First, several deep learning models are used to extract useful information from multiple modalities: pre-trained Convolutional Neural Networks (CNNs) for visual and audio feature extraction, and a word embedding model for textual analysis. Then, a novel fusion technique is proposed that integrates the different data representations at two levels, namely the frame level and the video level. Unlike existing multimodal learning algorithms, the proposed framework can reason about a missing data type from the other available modalities. The framework is applied to a new video dataset containing natural disaster classes. The experimental results demonstrate its effectiveness compared to several single-modality deep learning models as well as conventional fusion techniques. Specifically, the final accuracy is improved by more than 16% and 7% over the best single-modality and fusion models, respectively.
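The two-level fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes that per-frame visual and audio features are fused by concatenation at the frame level, that frame-level class probabilities are averaged over time, and that a video-level textual prediction is blended in by late fusion, with a simple fallback when the textual modality is missing. All function names and the weight `w` are hypothetical.

```python
import numpy as np

def frame_level_fusion(visual, audio):
    """Concatenate per-frame visual (n_frames, d_v) and audio
    (n_frames, d_a) feature vectors into one representation."""
    return np.concatenate([visual, audio], axis=1)

def video_level_fusion(frame_probs, text_probs, w=0.5):
    """Average frame-level class probabilities over time, then
    blend with a video-level textual prediction (late fusion).
    If the textual modality is missing, fall back to the
    visual/audio prediction alone (hypothetical handling)."""
    video_probs = frame_probs.mean(axis=0)
    if text_probs is None:
        return video_probs
    return w * video_probs + (1 - w) * text_probs

# Toy example: 4 frames, 3-dim visual + 2-dim audio features
fused = frame_level_fusion(np.ones((4, 3)), np.zeros((4, 2)))
print(fused.shape)  # (4, 5)

# Two frames, two classes; blend with a textual prediction
frame_probs = np.array([[0.6, 0.4], [0.2, 0.8]])
print(video_level_fusion(frame_probs, np.array([0.5, 0.5])))
print(video_level_fusion(frame_probs, None))
```

The fallback branch mirrors the paper's claim that a missing modality can be compensated for by the remaining ones; here the compensation is simply dropping the missing term from the weighted sum.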

Metadata
Title
Multimodal deep representation learning for video classification
Authors
Haiman Tian
Yudong Tao
Samira Pouyanfar
Shu-Ching Chen
Mei-Ling Shyu
Publication date
03.05.2018
Publisher
Springer US
Published in
World Wide Web / Issue 3/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-018-0548-3
