The temporal segmentation of a video into shots is a fundamental prerequisite for video retrieval. There are two types of shot boundaries: abrupt shot changes (“cuts”) and gradual transitions. Several high-quality algorithms have been proposed for detecting cuts, but the reliable detection of gradual transitions remains a surprisingly difficult problem in practice. In this paper, we present an unsupervised approach for detecting gradual transitions. It has several advantages. First, in contrast to alternative approaches, no training stage and hence no training data are required. Second, no thresholds are needed, since the clustering approach used separates the classes of gradual transitions and non-transitions automatically and
individually for each video. Third, it is a generic approach that does not employ a specialized detector for each transition type. Finally, the issue of removing false alarms caused by camera motion is addressed: in contrast to related approaches, false alarm removal is based not only on low-level features but also on the results of an appropriate camera motion estimation algorithm. Experimental results show that the proposed approach achieves very good performance on the TRECVID shot boundary test data.