Sign language recognition constitutes a challenging field of research in computer vision. Common problems like overlap, ambiguities, and minimal pairs occur frequently and require robust algorithms for feature extraction and processing. We present a system that performs person-dependent recognition of 232 isolated signs with an accuracy of 99.3% in a controlled environment. Person-independent recognition rates reach 44.1% for 221 signs. An average performance of 87.8% is achieved for six signers in various uncontrolled indoor and outdoor environments, using a reduced vocabulary of 18 signs.
The system uses a background model to remove static areas from the input video on pixel level. In the tracking stage, multiple hypotheses are pursued in parallel to handle ambiguities and facilitate retrospective correction of errors. A winner hypothesis is found by applying high level knowledge of the human body, hand motion, and the signing process. Overlaps are resolved by template matching, exploiting temporally adjacent frames with no or less overlap. The extracted features are normalized for person-independence and robustness, and classified by Hidden Markov Models.