2005 | OriginalPaper | Book chapter

Finding Structure in a Document Collection

Authors: Sholom M. Weiss, Nitin Indurkhya, Tong Zhang, Fred J. Damerau

Published in: Text Mining

Publisher: Springer New York

Prediction methods look at stored examples with correct answers and project answers for new examples. One would expect that if we cannot obtain answers for the training examples, then the process cannot be completed. Given a collection of documents, we have no problem transforming the unstructured set of words for each document into a structured spreadsheet. But the last column must also be filled in. In Figure 5.1, we see a spreadsheet, a list of labels, and the spreadsheet column containing the labeled answers. Someone must compose a list of potential labels. Given the list, someone assigns labels to the documents. Sometimes label assignment can be automated, such as the label that a company’s stock price has risen. In most instances, such as topic assignment to newswire articles, the assignment of labels is done by humans, and this can be a tedious and expensive task. Is there any way to assign labels automatically to a document collection? We will discuss this task. Not only will the labels be assigned, but the list of labels will also be determined automatically. Because such key information is missing from the problem description, our expectations for accurate predictive performance should be lower than for standard prediction applications with labeled data.
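
The task described above can be illustrated with a small sketch. The abstract does not name a method, so the choice of k-means clustering with cosine similarity is an assumption made here for illustration, as are the function names `to_vectors` and `kmeans`, the toy documents, and the seeding of centroids with the first k rows. The sketch builds the word-count "spreadsheet" and then fills in the missing label column with cluster numbers.

```python
from collections import Counter
import math


def to_vectors(docs):
    """Turn each document into a row of word counts (the 'spreadsheet')."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([float(counts[word]) for word in vocab])
    return vocab, rows


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def kmeans(rows, k, iterations=10):
    """Assign each row to one of k clusters (hypothetical helper).

    Assumption: centroids are seeded with the first k rows so that the
    result is deterministic; real applications usually seed randomly.
    """
    centroids = [list(rows[i]) for i in range(k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each row.
        labels = [
            max(range(k), key=lambda c: cosine(row, centroids[c]))
            for row in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy documents, invented for this example.
docs = [
    "stock price rose sharply",
    "team won the football match",
    "stock market price fell",
    "football team lost the match",
]
vocab, rows = to_vectors(docs)
labels = kmeans(rows, k=2)
print(labels)  # [0, 1, 0, 1]
```

The cluster numbers are arbitrary identifiers rather than meaningful topic names, which is one reason expectations for such unlabeled settings should be modest.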

Previous chapter: Information Retrieval and Text Mining

Next chapter: Looking for Information in Documents

Title: Finding Structure in a Document Collection
Authors: Sholom M. Weiss
Nitin Indurkhya
Tong Zhang
Fred J. Damerau
Publisher: Springer New York
Book: Text Mining
Print ISBN: 978-0-387-95433-2

Electronic ISBN: 978-0-387-34555-0

Copyright year: 2005
DOI: https://doi.org/10.1007/978-0-387-34555-0_5
