DOI: 10.3115/1119250.1119276

Introduction to CKIP Chinese word segmentation system for the first international Chinese Word Segmentation Bakeoff

Published: 11 July 2003

ABSTRACT

In this paper, we briefly describe the procedures of our segmentation system, including the methods for resolving segmentation ambiguities and identifying unknown words. The CKIP group of Academia Sinica participated in testing on the open and closed tracks of Beijing University (PK) and Hong Kong CityU (HK). The evaluation results show that our system performs very well in both the HK open and HK closed tracks, but only acceptably in the PK tracks. We also present some explanations and analysis of these results.


Published in: SIGHAN '03: Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17, July 2003, 193 pages

Publisher: Association for Computational Linguistics, United States
