2015 | OriginalPaper | Book chapter
Clustering Retrieved Web Documents to Speed Up Web Searches
Authors: Rani Qumsiyeh, Yiu-Kai Ng
Published in: Computational Science and Its Applications -- ICCSA 2015
Current web search engines, such as Google, Bing, and Yahoo!, rank the set of documents S retrieved in response to a user query and display the URL of each document D in S with a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as intended: they are supposed to help users quickly identify results of interest, if any exist, yet they fail to (i) provide distinct information and (ii) capture the main contents of the corresponding documents. Moreover, when the information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user’s intended request without requiring additional input. Furthermore, a document title is not always a good indicator of the content of the corresponding document. All of these design problems can be solved by our proposed query-based cluster and labeler, called QClus. QClus generates concise clusters of documents covering various subject areas retrieved in response to a user query, which saves the user time and effort in searching for specific information of interest, since the user does not have to browse through the documents one by one. Experimental results show that QClus is effective and efficient in generating high-quality clusters of documents on specific topics with informative labels.