
2016 | OriginalPaper | Chapter

ActionSnapping: Motion-Based Video Synchronization

Authors: Jean-Charles Bazin, Alexander Sorkine-Hornung

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing


Abstract

Video synchronization is a fundamental step for many applications in computer vision, ranging from video morphing to motion analysis. We present a novel method for synchronizing action videos in which a similar action is performed by different people at different times and locations, with different local speed changes, e.g., in sports such as weightlifting, baseball pitching, or dance. Our approach extends the popular “snapping” tool of video editing software and allows users to automatically snap action videos together in a timeline based on their content. Since the action can take place at different locations, existing appearance-based methods are not appropriate. Our approach instead leverages motion information and computes a nonlinear synchronization of the input videos to establish frame-to-frame temporal correspondences. We demonstrate that our approach can be applied to video synchronization, video annotation, and action snapshots. It has been successfully evaluated with ground-truth data and a user study.
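The abstract describes computing a nonlinear synchronization that maps frames of one action video to frames of another despite local speed changes. The paper's actual formulation is not reproduced here; as a rough illustration of the underlying idea, the sketch below uses standard dynamic time warping over hypothetical per-frame motion descriptors (e.g., pooled optical-flow histograms) to recover such a frame-to-frame warping path. All names and the descriptor representation are assumptions for illustration only.

```python
import numpy as np

def dtw_align(desc_a: np.ndarray, desc_b: np.ndarray) -> list:
    """Nonlinear temporal alignment of two videos, each represented as an
    array of per-frame motion descriptors (one row per frame).
    Returns a monotonic list of (i, j) frame-to-frame correspondences."""
    n, m = len(desc_a), len(desc_b)
    # Pairwise frame cost: Euclidean distance between motion descriptors.
    cost = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    # Accumulated cost with the standard DTW recurrence.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # advance both videos
                acc[i - 1, j],      # video B locally slower
                acc[i, j - 1],      # video A locally slower
            )
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

For example, aligning a sequence against a half-speed copy of itself (each frame repeated twice) yields a path that maps every original frame to its two duplicates, mimicking a local speed change. Real descriptors would of course come from dense motion estimation rather than raw values.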


Metadata
DOI
https://doi.org/10.1007/978-3-319-46454-1_10
