Article

Structure-sensitive manifold ranking for video concept detection

Authors:
Jinhui Tang, University of Science and Technology of China, Hefei, China
Xian-Sheng Hua, Microsoft Research Asia, Beijing, China
Guo-Jun Qi, University of Science and Technology of China, Hefei, China
Meng Wang, University of Science and Technology of China, Hefei, China
Tao Mei, Microsoft Research Asia, Beijing, China
Xiuqing Wu, University of Science and Technology of China, Hefei, China

MM '07: Proceedings of the 15th ACM international conference on Multimedia, September 2007, Pages 852–861
https://doi.org/10.1145/1291233.1291430

Published: 29 September 2007

ABSTRACT

Pairwise similarity of samples is an essential factor in graph-propagation-based semi-supervised learning methods. It is usually estimated from Euclidean distance alone. However, the structural assumption, which is a basic assumption in these methods, is not taken into account by this conventional pairwise similarity measure. In this paper, we propose a novel graph-based learning approach, named Structure-Sensitive Manifold Ranking (SSMR), based on a structure-sensitive similarity measure. Instead of using distance only, SSMR takes local distribution differences into account to measure pairwise similarity more accurately. Furthermore, we show that SSMR can also be deduced from anisotropic diffusion based on a partial differential equation. Experiments conducted on the TRECVID dataset show that this approach significantly outperforms existing graph-based semi-supervised learning methods for video semantic concept detection.
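The abstract does not state the SSMR similarity formula, so the following is only a minimal sketch of the general idea, not the authors' method. It pairs the standard closed-form manifold ranking solution, f = (I - alpha S)^(-1) y with S the symmetrically normalized affinity matrix, with a hypothetical density-aware affinity: the usual Gaussian weight on Euclidean distance is shrunk when two samples have different local kernel density estimates. The function names, the density penalty, and the parameters `sigma`, `beta`, and `alpha` are all assumptions made for illustration.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def gaussian_affinity(X, sigma=1.0):
    """Distance-only similarity: W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the graph
    return W


def structure_sensitive_affinity(X, sigma=1.0, beta=1.0):
    """ASSUMPTION: illustrative density-aware similarity, not the paper's formula."""
    W = gaussian_affinity(X, sigma)
    # Kernel density estimate at each sample, used as a stand-in for local structure.
    density = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2)).mean(axis=1)
    # Relative difference in local density between every pair of samples.
    rel_diff = np.abs(density[:, None] - density[None, :]) / density.mean()
    # Shrink the weight of edges that join regions of different density.
    return W * np.exp(-beta * rel_diff)


def manifold_ranking(W, y, alpha=0.99):
    """Standard manifold ranking: solve (I - alpha * S) f = y."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^(-1/2) W D^(-1/2)
    return np.linalg.solve(np.eye(len(y)) - alpha * S, y)


# Toy data: two well-separated clusters, one labeled sample in the first.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (10, 2)), rng.normal(5.0, 0.3, (10, 2))])
y = np.zeros(20)
y[0] = 1.0
scores = manifold_ranking(structure_sensitive_affinity(X), y)
```

In this toy setup the single positive label propagates mainly within its own cluster, so the first ten scores rank above the last ten. Swapping `structure_sensitive_affinity` for `gaussian_affinity` gives the distance-only baseline that the abstract contrasts against.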

Index Terms

Structure-sensitive manifold ranking for video concept detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published in
MM '07: Proceedings of the 15th ACM international conference on Multimedia
September 2007
1115 pages
ISBN: 9781595937025
DOI: 10.1145/1291233
General Chairs:
Rainer Lienhart, University of Augsburg, Germany
Anand R. Prasad, DoCoMo Euro-Labs, Germany
Program Chairs:
Alan Hanjalic, Delft University of Technology, The Netherlands
Sunghyun Choi, Seoul National University, South Korea
Brian Bailey, University of Illinois at Urbana-Champaign
Nicu Sebe, University of Amsterdam, The Netherlands
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
manifold ranking
semi-supervised learning
video concept detection
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%