2004 | OriginalPaper | Chapter
Automatic HTML to XML Conversion
Authors : Shijun Li, Mengchi Liu, Tok Wang Ling, Zhiyong Peng
Published in: Advances in Web-Age Information Management
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
We present a new approach to automatically convert HTML documents into XML documents. It first captures the inter-blocks nested structure, then the intra-blocks nested structure, which consists of blocks including headings, lists, paragraphs and tables in HTML documents, by exploiting both formatting information and structural information implied by HTML tags.