ABSTRACT
We present a method for browsing videos by directly dragging their content. This method brings the benefits of direct manipulation to an activity typically mediated by widgets. We support this new type of interactivity by: 1) automatically extracting motion data from videos; and 2) introducing a new technique called relative flow dragging, which lets users control video playback by moving objects of interest along their visual trajectory. We show that this method can outperform the traditional seeker bar in video browsing tasks that focus on visual content rather than time.
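The idea of controlling playback by dragging an object along its trajectory can be illustrated with a minimal sketch. It is not the relative flow dragging technique itself, which the abstract does not specify in detail. It assumes the trajectory of one object is already available as one (x, y) position per frame and simply picks the frame whose position is nearest to the pointer. The function name `frame_for_pointer` and the sample trajectory are hypothetical.

```python
import math

def frame_for_pointer(trajectory, pointer):
    """Return the index of the frame whose trajectory point is closest to the pointer.

    trajectory: list of (x, y) positions of one object, one entry per frame
                (assumed to come from a prior motion-extraction step).
    pointer:    (x, y) position of the mouse while dragging.
    """
    px, py = pointer
    return min(
        range(len(trajectory)),
        key=lambda i: math.hypot(trajectory[i][0] - px, trajectory[i][1] - py),
    )

# Hypothetical trajectory of an object over five frames.
trajectory = [(0, 0), (10, 0), (20, 0), (30, 10), (40, 20)]

# Dragging near (19, 3) selects the frame where the object is at (20, 0).
print(frame_for_pointer(trajectory, (19, 3)))  # prints 2
```

A plain nearest-point lookup like this can jump between distant frames when a trajectory crosses itself, which is one reason a practical technique would need more than this sketch.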