2003 | Original Paper | Book Chapter

Extracting Content Structure for Web Pages Based on Visual Representation

Authors: Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma

Published in: Web Technologies and Applications

Publisher: Springer Berlin Heidelberg

This paper proposes a new web content structure based on visual representation. Many web applications, such as information retrieval, information extraction, and automatic page adaptation, can benefit from this structure. The paper presents an automatic, top-down, tag-tree-independent approach to detecting web content structure. It simulates how a user understands web layout structure based on visual perception. Compared with existing techniques, the approach is independent of the underlying document representation, such as HTML, and works well even when the HTML structure differs greatly from the layout structure. Experiments show satisfactory results.
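
The abstract gives no algorithmic detail, so the following is only a minimal sketch of what a top-down, tag-tree-independent segmentation could look like. It is not the authors' method. It swaps in a simple recursive gap-based split (in the spirit of XY-cut page segmentation) that works on rendered bounding boxes alone and never consults the HTML tree. The names `Box`, `Block`, `widest_gap`, `segment`, and the `min_gap` threshold are all hypothetical and introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Rendered bounding rectangle of one page element (hypothetical input)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


@dataclass
class Block:
    """A node of the recovered content structure."""
    boxes: List[Box]
    children: List["Block"] = field(default_factory=list)


def widest_gap(
    boxes: List[Box],
    lo: Callable[[Box], float],
    hi: Callable[[Box], float],
    min_gap: float,
) -> Optional[Tuple[float, float]]:
    """Find the widest empty band along one axis.

    Returns (gap_width, cut_position) or None if no band is >= min_gap.
    """
    intervals = sorted((lo(b), hi(b)) for b in boxes)
    best: Optional[Tuple[float, float]] = None
    reach = intervals[0][1]  # furthest extent covered so far
    for start, end in intervals[1:]:
        gap = start - reach
        if gap >= min_gap and (best is None or gap > best[0]):
            best = (gap, reach + gap / 2)
        reach = max(reach, end)
    return best


def segment(boxes: List[Box], min_gap: float = 10.0) -> Block:
    """Recursively split boxes at the widest whitespace band (assumed heuristic)."""
    block = Block(boxes=list(boxes))
    if len(boxes) < 2:
        return block

    candidates = []
    h = widest_gap(boxes, lambda b: b.top, lambda b: b.bottom, min_gap)
    if h is not None:
        candidates.append((h[0], h[1], "y"))
    v = widest_gap(boxes, lambda b: b.left, lambda b: b.right, min_gap)
    if v is not None:
        candidates.append((v[0], v[1], "x"))
    if not candidates:
        return block  # visually indivisible at this threshold

    _, cut, axis = max(candidates)
    if axis == "y":
        first = [b for b in boxes if b.bottom <= cut]
        second = [b for b in boxes if b.top >= cut]
    else:
        first = [b for b in boxes if b.right <= cut]
        second = [b for b in boxes if b.left >= cut]

    block.children = [segment(first, min_gap), segment(second, min_gap)]
    return block


def leaves(block: Block) -> List[List[str]]:
    """Collect element names of each leaf block, in tree order."""
    if not block.children:
        return [[b.name for b in block.boxes]]
    result: List[List[str]] = []
    for child in block.children:
        result.extend(leaves(child))
    return result
```

Because the input is only a list of rectangles, the same routine would apply whether the boxes came from HTML, a PDF, or a screenshot, which is the sense in which such an approach is independent of the document representation. A real system would need richer visual cues (fonts, colors, separators) than whitespace alone.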

Metadata
Title
Extracting Content Structure for Web Pages Based on Visual Representation
Authors
Deng Cai
Shipeng Yu
Ji-Rong Wen
Wei-Ying Ma
Copyright Year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-36901-5_42