2012 | OriginalPaper | Chapter
PPLSA: Parallel Probabilistic Latent Semantic Analysis Based on MapReduce
Authors: Ning Li, Fuzhen Zhuang, Qing He, Zhongzhi Shi
Published in: Intelligent Information Processing VI
Publisher: Springer Berlin Heidelberg
Probabilistic Latent Semantic Analysis (PLSA) is a popular topic modeling technique for exploring document collections. As large datasets become increasingly common, PLSA computation needs to scale better. In this paper, we propose a parallel PLSA algorithm, called PPLSA, that handles large corpora within the MapReduce framework. Our solution distributes computation efficiently and is relatively simple to implement.
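The abstract does not spell out how the computation is distributed, so the following is only a minimal sketch of one common way a PLSA EM iteration can be split into map and reduce steps. It is not the authors' PPLSA algorithm. It assumes the standard PLSA updates for P(z|d) and P(w|z), with one map call per document and a single reduce over (topic, word) keys. All names (`map_document`, `reduce_word_topic`, `em_iteration`, `init_params`, `log_likelihood`) are hypothetical, and the whole thing runs in a single process rather than on a MapReduce cluster.

```python
import math
import random
from collections import defaultdict

def init_params(num_docs, vocab, num_topics, seed=0):
    """Random, normalized starting values for P(z|d) and P(w|z)."""
    rng = random.Random(seed)
    def normalized(n):
        raw = [rng.random() + 0.1 for _ in range(n)]
        total = sum(raw)
        return [v / total for v in raw]
    p_z_d_all = [normalized(num_topics) for _ in range(num_docs)]
    p_w_z = [dict(zip(vocab, normalized(len(vocab)))) for _ in range(num_topics)]
    return p_z_d_all, p_w_z

def map_document(doc_counts, p_z_d, p_w_z):
    """Map step (assumed design): E-step for one document.

    Emits ((topic, word), expected_count) pairs and returns the
    document's updated P(z|d), which needs no cross-document data.
    """
    num_topics = len(p_z_d)
    emitted, new_p_z_d = [], [0.0] * num_topics
    for word, count in doc_counts.items():
        joint = [p_z_d[z] * p_w_z[z][word] for z in range(num_topics)]
        total = sum(joint)
        if total == 0.0:
            continue  # word has zero probability under every topic
        for z in range(num_topics):
            expected = count * joint[z] / total  # n(d,w) * P(z|d,w)
            emitted.append(((z, word), expected))
            new_p_z_d[z] += expected
    norm = sum(new_p_z_d)
    return emitted, ([v / norm for v in new_p_z_d] if norm > 0 else list(p_z_d))

def reduce_word_topic(emitted_pairs, num_topics, vocab):
    """Reduce step (assumed design): sum expected counts per key, then
    normalize over words within each topic to obtain the new P(w|z)."""
    sums = defaultdict(float)
    for key, value in emitted_pairs:
        sums[key] += value
    p_w_z = []
    for z in range(num_topics):
        total = sum(sums[(z, w)] for w in vocab)
        p_w_z.append({w: sums[(z, w)] / total if total > 0 else 1.0 / len(vocab)
                      for w in vocab})
    return p_w_z

def em_iteration(corpus, p_z_d_all, p_w_z, vocab):
    """One EM iteration: map over documents, then a single reduce."""
    all_emitted, new_p_z_d_all = [], []
    for doc_counts, p_z_d in zip(corpus, p_z_d_all):
        emitted, new_p_z_d = map_document(doc_counts, p_z_d, p_w_z)
        all_emitted.extend(emitted)
        new_p_z_d_all.append(new_p_z_d)
    return new_p_z_d_all, reduce_word_topic(all_emitted, len(p_w_z), vocab)

def log_likelihood(corpus, p_z_d_all, p_w_z):
    """Sum over d, w of n(d,w) * log(sum over z of P(z|d) * P(w|z))."""
    return sum(
        count * math.log(sum(p_z_d[z] * p_w_z[z][word] for z in range(len(p_w_z))))
        for doc_counts, p_z_d in zip(corpus, p_z_d_all)
        for word, count in doc_counts.items()
    )
```

In this sketch each document is a dictionary of word counts, for example `{"a": 3, "b": 1}`. The map calls are independent given the current P(w|z), which is what makes the per-document work easy to spread across workers. Only the summed (topic, word) counts need to be shuffled to the reducer.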