When searching for RDF vocabularies, users often find it hard to judge the relevance of a retrieved vocabulary because its description is lengthy. A natural strategy for dealing with this issue is to generate a summary of the vocabulary description that compactly conveys its main theme and reveals its relevance to the user’s information need. In this paper, we present a new solution to this vocabulary summarization problem, which our previous work defined as ranking and selecting RDF sentences. First, we propose a novel bipartite graph representation of a vocabulary description. A stochastic analysis of a random surfer’s behavior on this graph yields a new centrality measure for RDF sentences, called BipRank. Second, we improve BipRank by investigating the patterns of RDF sentences and exploiting their statistical features. Third, we combine BipRank with query relevance and cohesion metrics into an aggregate objective function, which is optimized to select RDF sentences. Our experiments on real-world vocabularies demonstrate that our approach outperforms the baseline and validate its scalability in practice.
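To make the random-surfer idea concrete, the following is a minimal sketch of a damped random walk on a bipartite graph that links RDF sentences to the terms they mention. It is an illustration only and should not be read as the BipRank definition, which this abstract does not give. The function name `bipartite_rank`, the `damping` and `iterations` parameters, the uniform transition probabilities, and the toy input are all assumptions introduced here.

```python
from collections import defaultdict


def bipartite_rank(sentence_terms, damping=0.85, iterations=50):
    """Score sentences with a damped walk: sentence -> term -> sentence.

    Hypothetical illustration; not the paper's BipRank formula.
    sentence_terms maps a sentence id to the terms it mentions.
    """
    sentences = list(sentence_terms)
    n = len(sentences)

    # Build the term side of the bipartite graph.
    term_sentences = defaultdict(list)
    for s, terms in sentence_terms.items():
        for t in set(terms):
            term_sentences[t].append(s)

    score = {s: 1.0 / n for s in sentences}
    for _ in range(iterations):
        # Half-step 1: each sentence splits its score evenly over its terms.
        term_score = defaultdict(float)
        for s in sentences:
            terms = set(sentence_terms[s])
            for t in terms:
                term_score[t] += score[s] / len(terms)

        # Half-step 2: each term splits its score evenly over its sentences.
        walked = defaultdict(float)
        for t, mass in term_score.items():
            for s in term_sentences[t]:
                walked[s] += mass / len(term_sentences[t])

        # Teleport uniformly with probability (1 - damping), then renormalize
        # so that sentences without terms do not leak probability mass.
        score = {s: (1 - damping) / n + damping * walked[s] for s in sentences}
        total = sum(score.values())
        score = {s: v / total for s, v in score.items()}
    return score


# Toy input (made up): s1 mentions three terms, s2 and s3 one each.
toy = {
    "s1": ["ex:Person", "ex:knows", "ex:Agent"],
    "s2": ["ex:Person"],
    "s3": ["ex:knows"],
}
ranking = sorted(bipartite_rank(toy).items(), key=lambda kv: -kv[1])
print(ranking)
```

In this toy graph the sentence that shares terms with both of the others accumulates the most score. A full summarizer along the lines of the abstract would then weigh such a centrality score against query relevance and cohesion when selecting sentences, which this sketch does not attempt.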
- BipRank: Ranking and Summarizing RDF Vocabulary Descriptions
- Springer Berlin Heidelberg