2013 | OriginalPaper | Chapter
Text Mining
This chapter looks at a particular type of classification task, where the objects are text documents. It describes a method of processing the documents, using a bag-of-words representation, so that they can be used by the classification algorithms given earlier in this book.
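As a rough illustration of the bag-of-words idea, the sketch below turns each document into a vector of term counts over a shared vocabulary, which is the kind of fixed-length attribute vector a standard classification algorithm can take as input. It is a minimal sketch under stated assumptions, not the chapter's own procedure: the function names `tokenize` and `bag_of_words`, the lower-casing, the letters-only tokenisation and the use of raw counts (rather than binary or weighted values) are all choices made here for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case alphabetic tokens (an assumed, simple scheme)."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, vectors) for a list of document strings.

    vocabulary is the sorted list of distinct terms across all documents;
    vectors[i][j] is the number of times vocabulary[j] occurs in documents[i].
    Word order within each document is discarded.
    """
    token_lists = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in token_lists for term in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing terms count as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Hypothetical two-document collection, used only for illustration.
docs = ["The cat sat on the mat", "The dog chased the cat"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each row of `vectors` can then be paired with a class label and passed to a classifier. In practice the vocabulary is usually reduced first, for example by removing very common words, since otherwise the number of attributes grows with every new term in the collection.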
An important special case of text classification arises when the documents are web pages. The automatic classification of web pages is known as hypertext categorisation. The differences between standard text classification and hypertext categorisation are illustrated, and issues relating to the latter are discussed.