Neural Computing and Applications

18 October 2023 | Original Article

Cross-media web video event mining based on multiple semantic-paths embedding

Authors: Xia Xiao, Mingyue Du, Shuyu Xu, Guoying Liu, Chengde Zhang

Published in: Neural Computing and Applications | Issue 2/2024

Abstract

Web video event mining based on cross-media fusion has become a research hotspot. However, each video is described by only a dozen or so noisy words, which makes its textual features extremely unstable. Moreover, different people may describe the same video with completely different words. As a result, the semantic associations between textual and visual information are very sparse, which poses a major challenge for web video event mining based on cross-media associations. To address this issue, this paper proposes a novel framework that enriches the associations between near-duplicate keyframes (NDKs) and terms using multiple semantic-paths embedding. After data preprocessing, a heterogeneous information network is built to establish associations among NDKs, terms and videos. Then, a semantic-path walk strategy is designed to generate meaningful semantic-node sequences for embedding. Next, an embedding fusion method is proposed to predict the distribution characteristics of each term across NDKs. Finally, multiple correspondence analysis is used to mine web video events. Experiments on web videos from YouTube show that the proposed method outperforms several state-of-the-art baseline models, with an average F1 score improvement of 19–50%.
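The abstract does not spell out how the network is built or how the walk proceeds, so the following is only a minimal sketch of the general idea, assuming a generic meta-path-guided random walk over a toy network of NDK, term and video nodes. The data, the function names `build_network` and `semantic_path_walk`, and the path NDK → video → term → video are all hypothetical illustrations, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical toy data: the NDKs and terms attached to each video.
VIDEO_NDKS = {"v1": ["k1", "k2"], "v2": ["k2"], "v3": ["k3"]}
VIDEO_TERMS = {"v1": ["flood", "rescue"], "v2": ["flood"], "v3": ["election"]}


def build_network(video_ndks, video_terms):
    """Build adjacency lists keyed by (node, neighbour_type).

    Videos link to their NDKs and terms, and each link is stored in both
    directions so a walk can move through a video between NDKs and terms.
    """
    adj = defaultdict(list)
    for video, ndks in video_ndks.items():
        for ndk in ndks:
            adj[(video, "ndk")].append(ndk)
            adj[(ndk, "video")].append(video)
    for video, terms in video_terms.items():
        for term in terms:
            adj[(video, "term")].append(term)
            adj[(term, "video")].append(video)
    return adj


def semantic_path_walk(adj, start, path, length, rng):
    """Random walk that follows the node types in `path` cyclically.

    `path` is a list of node types such as ["ndk", "video", "term", "video"];
    `start` is assumed to be a node of type path[0]. The walk stops early
    if the current node has no neighbour of the required type.
    """
    walk = [start]
    step = 0
    while len(walk) < length:
        next_type = path[(step + 1) % len(path)]
        neighbours = adj.get((walk[-1], next_type), [])
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
        step += 1
    return walk


if __name__ == "__main__":
    network = build_network(VIDEO_NDKS, VIDEO_TERMS)
    sequence = semantic_path_walk(
        network, "k1", ["ndk", "video", "term", "video"], 9, random.Random(0)
    )
    print(sequence)
```

In a pipeline of this kind, node sequences like the one printed here would typically be passed to a skip-gram style model to learn embeddings for NDKs and terms. Which embedding model and fusion rule the paper actually uses is not stated in the abstract.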

Title: Cross-media web video event mining based on multiple semantic-paths embedding
Authors: Xia Xiao, Mingyue Du, Shuyu Xu, Guoying Liu, Chengde Zhang
Publication date: 18 October 2023
Publisher: Springer London
Published in: Neural Computing and Applications, Issue 2/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-023-09050-6
