2011 | OriginalPaper | Chapter
An Active Learning Process for Extraction and Standardisation of Medical Measurements by a Trainable FSA
Authors: Jon Patrick, Mojtaba Sabbagh
Published in: Computational Linguistics and Intelligent Text Processing
Publisher: Springer Berlin Heidelberg
Medical scores and measurements are an important part of clinical notes because clinical staff infer a patient’s state by analysing them, especially their variation over time. We have devised an active learning process for rapidly training an engine that detects regular patterns of scores, measurements, people and places in clinical texts. The task has two objectives: first, to find a comprehensive collection of validated patterns in a time-efficient manner; and second, to transform the captured examples into canonical forms. The first step of the process was to train a finite-state automaton (FSA) from seed patterns and then use the FSA to extract further examples of patterns from the corpus.
The next step was to identify partial true positives (PTPs) among the newly extracted examples: a human annotator reviewed the extractions, identified the PTPs and added the corrected form of each to the training set as a new pattern. This cycle continued until no new PTPs were detected. The process proved effective, requiring 5 cycles to create 371 true positives (TPs) from 200 texts. We believe this gives 95% coverage of the TPs in the corpus.
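The cycle described in the abstract (extract with the current patterns, have a reviewer correct the partial matches, add the corrections as new patterns, stop when nothing new turns up) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: regular expressions stand in for the trainable FSA, the human annotator is replaced by a scripted function, and every name (`SEED_PATTERNS`, `extract`, `canonicalise`, `train`, `scripted_annotator`), the sample notes and the canonical form `BP <systolic>/<diastolic> mmHg` are hypothetical.

```python
import re

# Hypothetical seed pattern; a regex stands in for one path through the FSA.
SEED_PATTERNS = [("blood_pressure", r"BP\s*(\d{2,3})/(\d{2,3})")]

# Made-up clinical snippets used only to exercise the loop.
NOTES = [
    "Obs stable, BP 120/80 on admission.",
    "Post-op BP 95/60mmHg, HR 88.",
]


def extract(patterns, texts):
    """Return a sorted list of (label, matched_text) pairs found in the texts."""
    found = set()
    for text in texts:
        for label, regex in patterns:
            for match in re.finditer(regex, text):
                found.add((label, match.group(0)))
    return sorted(found)


def canonicalise(matched_text):
    """Rewrite a blood-pressure match in an assumed canonical form."""
    systolic, diastolic = re.search(
        r"(\d{2,3})\s*/\s*(\d{2,3})", matched_text
    ).groups()
    return f"BP {systolic}/{diastolic} mmHg"


def train(patterns, texts, review, max_cycles=10):
    """Repeat extract -> review -> extend until the reviewer adds nothing new.

    `review` receives the current extractions and returns corrected patterns
    for any partial true positives it spots. Returns (patterns, cycles_run).
    """
    patterns = list(patterns)
    for cycle in range(1, max_cycles + 1):
        extractions = extract(patterns, texts)
        corrections = [p for p in review(extractions) if p not in patterns]
        if not corrections:
            return patterns, cycle
        patterns.extend(corrections)
    return patterns, max_cycles


def scripted_annotator(extractions):
    """Stand-in for the human reviewer.

    Treats 'BP 95/60' as a partial match (the unit was cut off) and supplies
    a corrected pattern until the full form 'BP 95/60mmHg' is extracted.
    """
    matched = {text for _, text in extractions}
    if "BP 95/60" in matched and "BP 95/60mmHg" not in matched:
        return [("blood_pressure", r"BP\s*(\d{2,3})/(\d{2,3})\s*mmHg")]
    return []


patterns, cycles = train(SEED_PATTERNS, NOTES, scripted_annotator)
```

In this toy run the loop stops on the second cycle, because the first review contributes one corrected pattern and the second finds nothing left to fix. In a real setting the `review` callback would be a person working through the extractions, and the stopping rule mirrors the abstract's "until no new PTPs were detected".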