Skip to main content

2004 | OriginalPaper | Buchkapitel

Automated Resolution of Noisy Bibliographic References

verfasst von : Markus Demleitner, Michael Kurtz, Alberto Accomazzi, Günther Eichhorn, Carolyn S. Grant, Steven S. Murray

Erschienen in: Classification, Clustering, and Data Mining Applications

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

We describe a system used by the NASA Astrophysics Data System to identify bibliographic references obtained from scanned article pages by OCR methods with records in a bibliographic database. We analyze the process generating the noisy references and conclude that the three-step procedure of correcting the OCR results, parsing the corrected string and matching it against the database provides unsatisfactory results. Instead, we propose a method that allows a controlled merging of correction, parsing and matching, inspired by dependency grammars. We also report on the effectiveness of various heuristics that we have employed to improve recall.

Metadaten
Titel
Automated Resolution of Noisy Bibliographic References
verfasst von
Markus Demleitner
Michael Kurtz
Alberto Accomazzi
Günther Eichhorn
Carolyn S. Grant
Steven S. Murray
Copyright-Jahr
2004
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-17103-1_49