
2004 | OriginalPaper | Book chapter

Discovering Aspects of Web Pages from Their Referential Contexts in the Web

Authors: Koji Zettsu, Yutaka Kidawara, Katsumi Tanaka

Published in: Database Systems for Advanced Applications

Publisher: Springer Berlin Heidelberg

Included in: Professional Book Archive


There is an enormous number of Web pages of unknown authorship, and although Web search engines evaluate the relevance of page content precisely, a user cannot be sure whether a search result contains credible information. A Web page is referred to by other pages in various contexts through links, and these contexts indicate the reputation of the page. For example, some pages may refer to a company’s page as “an excellent local company” and others may refer to it as “a member of a certain research project”, while the company’s page itself might contain only product and service information. Such references are called “aspects” of the Web page, as distinguished from the content of the page. In this paper, we propose an approach for discovering aspects that characterize Web pages based on their contexts. We define criteria for selecting “aspectual” Web content according to (1) its strength of association with the page, derived from the logical structure of the Web (i.e., Web document structure and link structure), (2) the novelty of its content compared to the page, and (3) its typicality among multiple contexts. We evaluate how these criteria affect aspect discovery results. We also explain how Web pages can be grouped by aspect similarity, which helps find Web pages that are referred to in the same way even though their content differs.
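The three selection criteria named in the abstract (association strength, novelty, typicality) can be illustrated with a small sketch. This is not the authors' algorithm: the function `score_aspect_terms`, the `distance` value standing in for structural closeness to the link, the `1 / (1 + distance)` weighting, the binary novelty test, and the product used to combine the three factors are all assumptions chosen only to make the idea concrete.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return the set of lowercase alphabetic terms in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score_aspect_terms(page_text, contexts):
    """Score candidate aspect terms for a page (illustrative only).

    contexts: list of (context_text, distance) pairs, where `distance`
    is a hypothetical measure of how far the text lies from the link
    to the page in the document/link structure (0 = closest).
    """
    page_terms = tokenize(page_text)
    association = defaultdict(float)  # best structural closeness per term
    occurrences = defaultdict(int)    # number of contexts containing the term

    for text, distance in contexts:
        for term in tokenize(text):
            # Assumed weighting: closer text is more strongly associated.
            association[term] = max(association[term], 1.0 / (1 + distance))
            occurrences[term] += 1

    scores = {}
    for term, strength in association.items():
        # Assumed novelty: 1 if the page itself never uses the term.
        novelty = 0.0 if term in page_terms else 1.0
        # Assumed typicality: share of contexts that mention the term.
        typicality = occurrences[term] / len(contexts)
        scores[term] = strength * novelty * typicality
    return scores


if __name__ == "__main__":
    page = "Acme sells products and services"
    contexts = [
        ("an excellent local company", 0),
        ("excellent company in the research project", 1),
    ]
    ranked = sorted(score_aspect_terms(page, contexts).items(),
                    key=lambda item: (-item[1], item[0]))
    for term, score in ranked:
        print(term, score)
```

In this toy setup, terms that appear near the link in many referring contexts but not on the page itself rank highest, which mirrors the intuition behind the three criteria; a real system would also need stop-word handling and a principled way to combine the factors.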
