Abstract

Automating information extraction from eligibility criteria would be a breakthrough in making that information usable for patient search in clinical databases. Most eligibility criteria contain temporal information associated with medical conditions and events. This project creates a novel natural language processing (NLP) pipeline that extracts temporal information from free-text eligibility criteria and classifies it as historic, current, or planned. The pipeline uses pattern learning algorithms to extract temporal information and a trained Random Forest classifier to classify it. The pipeline achieved an accuracy of 0.82 in temporal data detection and classification, with an average precision of 0.83 and an average recall of 0.80 in temporal data classification.
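
The two-stage idea (extract a temporal expression, then label the criterion as historic, current, or planned) can be illustrated with a minimal sketch. It is not the pipeline described here: a hand-written regular expression stands in for the learned patterns, and simple cue-word rules stand in for the trained Random Forest classifier. All names, patterns, and cue words below are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical hand-written pattern for expressions such as
# "within the past 6 months"; the paper's pipeline learns its patterns.
TEMPORAL_PATTERN = re.compile(
    r"\b(?:within|in|for|over)\s+(?:the\s+)?"
    r"(?:(?:last|past|previous|next|following)\s+)?"
    r"\d+\s+(?:day|week|month|year)s?\b",
    re.IGNORECASE,
)

# Hypothetical cue words; a stand-in for the trained Random Forest classifier.
PLANNED_CUES = re.compile(r"\b(?:planned|scheduled|next|following|upcoming)\b")
HISTORIC_CUES = re.compile(r"\b(?:history of|prior|previous|last|past)\b")


def extract_temporal(criterion):
    """Return every temporal expression matched in one eligibility criterion."""
    return [match.group(0) for match in TEMPORAL_PATTERN.finditer(criterion)]


def classify(criterion):
    """Label a criterion as 'planned', 'historic', or 'current' from cue words."""
    text = criterion.lower()
    if PLANNED_CUES.search(text):
        return "planned"
    if HISTORIC_CUES.search(text):
        return "historic"
    # No planned or historic cue found: fall back to 'current'.
    return "current"


if __name__ == "__main__":
    examples = [
        "Myocardial infarction within the past 6 months",
        "Surgery scheduled in the next 4 weeks",
        "Currently receiving insulin therapy",
    ]
    for criterion in examples:
        print(criterion, extract_temporal(criterion), classify(criterion))
```

In a setup closer to the one described, the cue-word rules would presumably be replaced by features fed to a classifier trained on annotated criteria; the sketch only shows how the extraction and classification steps fit together.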

Published in: International Conference on Information Society (i-Society 2016)

  • Date of Conference: 10-13 October 2016
  • DOI: 10.2053/iSociety.2016.0024
  • ISBN: 978-1-908320-62-9
  • Conference Location: Dublin, Ireland