ABSTRACT
We present a first study of using RGB-D (Kinect-style) cameras for fine-grained recognition of kitchen activities. Our prototype system combines depth (shape) and color (appearance) to solve a number of perception problems crucial for smart space applications: locating hands, identifying objects and their functionalities, recognizing actions, and tracking object state changes through actions. Our proof-of-concept results demonstrate the great potential of RGB-D perception: without the need for instrumentation, our system can robustly track and accurately recognize detailed steps through cooking activities, for instance, how many spoons of sugar are in a cake mix, or how long it has been mixed. A robust RGB-D based solution to fine-grained activity recognition in real-world conditions will bring the intelligence of pervasive and interactive systems to the next level.
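The abstract's core idea, combining a depth (shape) cue with a color (appearance) cue into one descriptor for recognition, can be illustrated with a minimal sketch. All function names here are hypothetical and the features (simple histograms) are far simpler than the descriptors the paper would actually use (e.g., depth kernel descriptors); this only shows the fusion pattern, assuming 8-bit RGB images and depth maps in millimeters.

```python
import numpy as np

# Hypothetical minimal sketch of RGB-D feature fusion: concatenate an
# appearance feature (color histogram) with a shape feature (depth
# histogram) into one normalized descriptor for a recognition system.

def color_histogram(rgb, bins=8):
    """Per-channel color histogram over an 8-bit RGB patch (appearance cue)."""
    hist = [np.histogram(rgb[..., c], bins=bins, range=(0, 255))[0]
            for c in range(3)]
    return np.concatenate(hist).astype(float)

def depth_histogram(depth, bins=8, max_mm=4000):
    """Histogram of depth values in millimeters (coarse shape cue)."""
    return np.histogram(depth, bins=bins, range=(0, max_mm))[0].astype(float)

def rgbd_feature(rgb, depth):
    """Fuse appearance and shape cues into one L2-normalized descriptor."""
    f = np.concatenate([color_histogram(rgb), depth_histogram(depth)])
    n = np.linalg.norm(f)
    return f / n if n > 0 else f

# Example: a random 48x64 RGB-D patch yields a 3*8 + 8 = 32-dim descriptor.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(48, 64, 3))
depth = rng.integers(500, 3000, size=(48, 64))
feat = rgbd_feature(rgb, depth)
print(feat.shape)  # (32,)
```

Descriptors like this could feed any standard classifier; the key design point the abstract highlights is that depth and color are complementary, so fusing both channels is more robust than either alone.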