2015 | OriginalPaper | Chapter
Word Segmentation of Micro Blogs with Bagging
Authors : Zhenting Yu, Xin-Yu Dai, Si Shen, Shujian Huang, Jiajun Chen
Published in: Natural Language Processing and Chinese Computing
Publisher: Springer International Publishing
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
This paper describes the model we designed for the Chinese word segmentation Task of NLPCC 2015. We firstly apply a word-based perceptron algorithm to build the base segmenter. Then, we use a Bootstrap Aggregating model of bagging which improves the segmentation results consistently on the three tracks of closed, semi-open and open test. Considering the characteristics of Weibo text, we also perform rule-based adaptation before decoding. Finally, our model achieves F-score 95.12% on closed track, 95.3% on semi-open track and 96.09% on open track.