2015 | OriginalPaper | Chapter
Multiclass Boosting Framework for Multimodal Data Analysis
Authors: Shixun Wang, Peng Pan, Yansheng Lu, Sheng Jiang
Published in: MultiMedia Modeling
Publisher: Springer International Publishing
A large number of multimedia documents containing both text and images have appeared on the internet; hence cross-modal retrieval, in which the modality of a query differs from that of the retrieved results, has become an interesting search paradigm. In this paper, a multimodal multiclass boosting framework (MMB) is proposed to capture intra-modal semantic information and inter-modal semantic correlation. Unlike traditional boosting methods, which are confined to two classes or a single modality, MMB can deal with multimodal data simultaneously. The empirical risk, which takes both intra-modal and inter-modal losses into account, is designed and then minimized by gradient descent in multidimensional functional spaces. More specifically, the optimization problem is solved for each modality in turn. A semantic space is naturally obtained by applying the sigmoid function to the quasi-margins. Extensive experiments on the Wiki and NUS-WIDE datasets show that our method significantly outperforms existing approaches for cross-modal retrieval.