2009 | OriginalPaper | Chapter
Latent Topic Extraction from Relational Table for Record Matching
Authors: Atsuhiro Takasu, Daiji Fukagawa, Tatsuya Akutsu
Published in: Discovery Science
Publisher: Springer Berlin Heidelberg
We propose a latent feature extraction method for record linkage. We first introduce a probabilistic model that generates records together with their latent topics. The generative model is designed to exploit co-occurrence among the attributes of a record. We then derive a topic estimation algorithm based on Gibbs sampling. The estimated topics are used to identify records. The algorithm is unsupervised; i.e., it does not require labor-intensive training data. We evaluated the proposed model on bibliographic records and found that it tended to perform better for records with more attributes, because it can exploit their co-occurrence.
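To make the idea of estimating latent topics from attribute co-occurrence concrete, here is a minimal Python sketch of collapsed Gibbs sampling. It is not the authors' model, which the abstract does not specify. It assumes a deliberately simple stand-in: each record has exactly one latent topic, every attribute holds one categorical value drawn from a topic-specific distribution, and all priors are symmetric Dirichlet. The function name gibbs_record_topics, the parameters alpha and beta, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def gibbs_record_topics(records, num_topics, iters=200, alpha=1.0, beta=0.1, seed=0):
    """Assign one latent topic per record by collapsed Gibbs sampling.

    Assumed model (illustrative only): topic ~ Categorical(theta),
    each attribute value ~ Categorical(phi[topic][attribute]),
    with symmetric Dirichlet priors alpha (topics) and beta (values).
    `records` is a list of equal-length tuples of hashable values.
    """
    rng = random.Random(seed)
    num_attrs = len(records[0])
    # Number of distinct values observed for each attribute.
    vocab = [len({rec[a] for rec in records}) for a in range(num_attrs)]

    # Random initial assignment and the count tables it implies.
    z = [rng.randrange(num_topics) for _ in records]
    topic_count = [0] * num_topics
    value_count = [[defaultdict(int) for _ in range(num_attrs)]
                   for _ in range(num_topics)]
    for rec, k in zip(records, z):
        topic_count[k] += 1
        for a, v in enumerate(rec):
            value_count[k][a][v] += 1

    for _ in range(iters):
        for i, rec in enumerate(records):
            # Remove record i from the counts.
            k_old = z[i]
            topic_count[k_old] -= 1
            for a, v in enumerate(rec):
                value_count[k_old][a][v] -= 1

            # Unnormalized conditional probability of each topic for record i.
            # Each record contributes one value per attribute, so the total
            # count for attribute a under topic k equals topic_count[k].
            weights = []
            for k in range(num_topics):
                w = topic_count[k] + alpha
                for a, v in enumerate(rec):
                    w *= (value_count[k][a][v] + beta) / (topic_count[k] + beta * vocab[a])
                weights.append(w)

            # Sample a new topic and add record i back into the counts.
            k_new = rng.choices(range(num_topics), weights=weights)[0]
            z[i] = k_new
            topic_count[k_new] += 1
            for a, v in enumerate(rec):
                value_count[k_new][a][v] += 1
    return z


# Toy bibliographic-style records: (author, venue, year). Made-up data.
toy_records = [
    ("takasu", "ds", "2009"),
    ("takasu", "ds", "2009"),
    ("fukagawa", "ds", "2009"),
    ("smith", "vldb", "2001"),
    ("smith", "vldb", "2001"),
    ("jones", "vldb", "2001"),
]
toy_topics = gibbs_record_topics(toy_records, num_topics=2, iters=100, seed=1)
print(toy_topics)
```

In this sketch, records that share values across several attributes tend to be pulled into the same topic, because every matching attribute multiplies the weight of that topic. The sampled topic could then serve as a feature when comparing records; how the paper actually uses the topics for matching is not described in the abstract.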