Using NLP to Improve PICO Element Identification and Extraction for SLRs and Evidence-based Medicine
Last Updated on July 24, 2023 by Editorial Team
Author(s): Gaugarin Oliver
Originally published on Towards AI.
Although the term "evidence-based medicine" (EBM) first appeared in print in the early 1990s, the history of this now-popular approach to clinical practice goes back much further. In the mid-18th century, James Lind, a Scottish naval physician, experimented with citrus-based scurvy treatments on several comparable groups of sick sailors. And there is evidence of what can loosely be called EBM stretching all the way back to the ancient Greeks.
The modern approach to EBM, which bases medical decisions on the evidence summarized in systematic literature reviews (SLRs), themselves based on analysis of randomized controlled trials (RCTs) of treatments for specific medical conditions, is still relatively new by comparison. But it's rapidly evolving, especially as medical professionals confront data volumes that are growing exponentially: according to RBC Capital Markets, health care is responsible for around 30 percent of the world's data volume, and that data is expected to be growing at an annual rate of 36 percent by 2025.
This explosion in health-care data has in part led to the large-scale adoption of the PICO model for developing specific clinical questions from RCTs. PICO is a mnemonic that stands for:
- Population/problem: Addresses the characteristics of populations involved and the specific characteristics of the disease or disorder
- Intervention: Addresses the primary intervention (including treatments, procedures, or diagnostic tests) along with any risk factors
- Comparison: Compares the efficacy of any new interventions with the primary intervention
- Outcome: Measures the results of the intervention, including improvements or side-effects
PICO helps evidence-based practitioners develop precise clinical questions and searchable keywords to answer those questions and, considering the above data volumes, is a critical tool. But it's also often extremely time-consuming and requires a high level of technical skill and medical domain knowledge. As we pointed out in a previous blog, SLRs require large teams of experts and take an average of more than 1,000 hours to complete, and in many cases developing a specific research question accounts for a significant share of those hours. Most PICO searches involve several steps, including question formation, keyword identification, search strategy development, search execution, and literature review. And that's not even considering the growing amount of health-care data we mentioned earlier: according to Nye et al., around 100 RCT manuscripts were published every day in 2015, and that number has almost certainly grown since then.
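To make the "question formation" and "keyword identification" steps a little more concrete, here is a minimal Python sketch of how the keywords behind a PICO question might be held in a small data structure and turned into a Boolean search string. The `PicoQuestion` class, the `build_query` helper, and the scurvy keywords are all hypothetical and chosen purely for illustration; real search strategies are usually far more elaborate and tuned to a specific database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicoQuestion:
    """Hypothetical container for the keywords behind one clinical question."""
    population: List[str] = field(default_factory=list)
    intervention: List[str] = field(default_factory=list)
    comparison: List[str] = field(default_factory=list)
    outcome: List[str] = field(default_factory=list)


def _or_group(terms: List[str]) -> str:
    """Quote each term and join the synonyms with OR inside parentheses."""
    quoted = ['"{}"'.format(term) for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(question: PicoQuestion) -> str:
    """Join the non-empty PICO groups with AND, in P, I, C, O order."""
    groups = [
        question.population,
        question.intervention,
        question.comparison,
        question.outcome,
    ]
    return " AND ".join(_or_group(group) for group in groups if group)


# Illustrative keywords only (loosely inspired by the scurvy example above).
question = PicoQuestion(
    population=["scurvy"],
    intervention=["citrus", "vitamin C"],
    outcome=["recovery"],
)
print(build_query(question))
# prints: ("scurvy") AND ("citrus" OR "vitamin C") AND ("recovery")
```

Synonyms within one element are joined with OR and the elements themselves with AND, which is one common way to narrow a literature search; empty elements (here, the comparison) are simply left out.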
Machine learning (ML) and natural language processing (NLP) can help automate the identification of PICO elements in this vast sea of information. That lets evidence-based practitioners develop precise research questions faster and more accurately, speeding up the entire SLR (and EBM) process.
The automation of PICO identification and extraction
Because the sheer volume of primary evidence available is becoming too challenging to wade through manually, researchers are experimenting with automating these tasks based on NLP techniques. Indeed, the rapidly growing amount of health-care data means it's "practically impossible for physicians to know which is the best medical intervention for a given patient group and condition." Adding to the challenge is that RCT results are typically published as unstructured free text, not structured data easily analyzed or queried through SQL and other standard techniques.
It's precisely this last challenge that makes NLP such a suitable tool for automating PICO element identification and extraction. "Methods to extract PICO elements for subsequent inspection could facilitate inclusion assessments for systematic reviews by allowing reviewers to rapidly judge relevance with respect to each PICO element," write Wallace et al. "Furthermore, automated PICO identification could expedite data extraction for systematic reviews, in which reviewers manually extract structured data to be reported and synthesized."
But according to Kang et al., previous attempts using support vector machine (SVM) and conditional random field (CRF) models have stalled, partly due to "the lack of publicly available, annotated corpora" for training, along with a dearth of available tools to perform named entity recognition (NER) and information retrieval (IR).
That's changing, however, with recent rapid advancements in ML and NLP, including the advent of deep learning, neural networks, and the increasing availability of publicly available annotated corpora for training and evaluation. Kang et al. cite the biLSTM-CRF model as particularly effective at NER for PICO-related applications, adding that the emergence of transfer learning in NLP is helping address the "high demand of large data for training neural networks."
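NER models of this kind typically label each token in a sentence, and those labels then have to be grouped back into readable phrases. The sketch below shows only that post-processing step, assuming the common BIO tagging convention (B- begins an entity, I- continues it, O is outside any entity). The `bio_to_spans` function, the `POP`/`INT`/`CMP` label names, and the example sentence are illustrative assumptions, not the output format or tag set of any particular model or corpus mentioned here.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        """Close the open span, if any, and reset the running state."""
        nonlocal current_label, current_tokens
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label extends the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close whatever is open and skip this token.
            flush()
    flush()
    return spans


# Hypothetical tagger output for one sentence (labels are illustrative).
tokens = ["Adults", "with", "type", "2", "diabetes",
          "received", "metformin", "or", "placebo", "."]
tags = ["B-POP", "I-POP", "I-POP", "I-POP", "I-POP",
        "O", "B-INT", "O", "B-CMP", "O"]

for label, text in bio_to_spans(tokens, tags):
    print(label, "->", text)
# prints:
# POP -> Adults with type 2 diabetes
# INT -> metformin
# CMP -> placebo
```

In a real pipeline the `tags` list would come from the trained model (for example, a biLSTM-CRF); here it is written by hand so that the grouping logic can be read on its own.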
Examples of NLP-automated PICO identification and extraction
Among these advancements, the release by Nye et al. in early 2018 of EBM-NLP, a corpus of 5,000 richly annotated medical article abstracts describing clinical RCTs, has been crucial. EBM-NLP is especially suitable for PICO element extraction because it takes into account trial population characteristics, interventions, comparators, and outcomes, making access to relevant data much less of an issue than in the past (although issues around the availability of data to train and evaluate NLP models for PICO identification and extraction persist). Despite these lingering issues, a number of researchers have made significant progress over the past several years:
- Bui et al. (2016) developed an NLP-based system that automatically summarizes full-text scientific articles and analyzes them for PICO values and SLR-related data elements. The model showed better recall (91.2% vs. 83.8%) and a higher density of relevant sentences (59% precision vs. 39%) when compared to human-written summaries.
- Wallace et al. (2016) proposed a method of speeding up evidence synthesis of full-text articles for PICO using supervised distant supervision (SDS). This approach "learns to automatically extract sentences pertaining to PICO elements from full-text articles describing RCTs" using a large semi-structured corpus (the Cochrane Database of Systematic Reviews, or CDSR).
- Kang et al. (2019) created an open-source PICO statement extraction tool to process RCTs using NER for PICO elements, Unified Medical Language System (UMLS) encoding, and XML outputs. Although it was trained on only a small dataset, it achieved "better performance than conventional machine learning models trained on a larger corpus," demonstrating that it's possible to develop NLP models for PICO applications without needing large amounts of training data.
- Brockmeier et al. (2019) trained a NER model using Nye et al.'s publicly available corpus, implementing the model as a recurrent neural network (RNN) and applying it to medical abstracts to identify and extract PICO elements. "The occurrences of words tagged in the context of specific PICO contexts are used as additional features for a relevancy classification model," the authors explained. "Simulations of the machine learning-assisted screening are used to evaluate the work saved by the relevancy model with and without the PICO features... Inclusion of PICO features improves the performance metric on 15 of the 20 collections, with substantial gains on certain systematic reviews."
- Jin et al. (2020) proposed a new deep learning model to recognize PICO elements, based on a bi-LSTM with a conditional random field architecture but adding a further bi-LSTM layer "so that the contextual information from surrounding sentences can be gathered to help infer the interpretation of the current one." Instead of using large corpora, the researchers also proposed using adversarial training and unsupervised pre-training to prime the model. In testing on benchmark datasets, the model outperformed previous bests by between 5.5 percent and 7.9 percent. The authors have made the code available.
- Marshall et al. (2020) developed Trialstreamer, a system to automatically find and categorize RCTs. It has grown into a publicly available annotated database of more than 700,000 RCTs derived from PubMed and the World Health Organization International Clinical Trials Registry Platform. The system extracts free-text descriptions of PICO elements, mapping them to the standardized Medical Subject Headings (MeSH) thesaurus. In the first five months of 2020, the researchers write, the system was able to categorize and index an average of 142 RCTs per day.
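Several of the results above are reported as precision and recall over extracted sentences or elements. As a reminder of what those two numbers mean, here is a minimal sketch that computes them for a set of sentences picked by a system against a set judged relevant by a reviewer. The `precision_recall` helper and the sentence IDs are made up for illustration and do not reproduce any of the studies listed.

```python
from typing import Set, Tuple


def precision_recall(selected: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Return (precision, recall) for a set of selected sentence IDs.

    precision = share of selected sentences that are relevant
    recall    = share of relevant sentences that were selected
    """
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sentence IDs: five picked by the system, four truly relevant.
selected = {1, 2, 3, 4, 5}
relevant = {2, 3, 5, 8}

precision, recall = precision_recall(selected, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.60 recall=0.75
```

A summary with high recall misses few relevant sentences, while one with high precision wastes little of the reader's time on irrelevant ones, which is why both figures are usually reported together.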
The promise of NLP for PICO identification and extraction
PICO is a crucial tool for evidence-based practitioners looking to evaluate the relevancy of RCTs and formulate specific, answerable research questions (and related keywords), but it can be time-consuming and prone to human error, and it requires a great deal of process and medical expertise. It has also become increasingly difficult to manually search for and identify PICO elements, considering the fast-growing amount of relevant health-care data being created.
The promise of NLP for PICO identification and extraction is that these practitioners can achieve results that are as good or better when scanning the literature for PICO elements, with far less manual labor. CapeStart's machine learning engineers, data scientists, and subject matter experts can help your next systematic literature review and PICO process with a range of healthcare-focused NLP and data annotation solutions, from pre-trained NLP models and model development to pre-annotated datasets for model training.