The flow of relevant real world data (RWD) to the life sciences industry has exploded in recent years, and most (80%) of that new content comes in the form of unstructured data. Unstructured data includes all the information that is shared in textual narrative and conversational formats. And in the age of Twitter and chatbots, that covers quite a bit of ground.
This information can be drawn from social media posts, journal articles, virtual customer contact requests, telehealth conversations, medical notes, and research information – and that is just the short list.
This content can be rich in value, but that value is often underutilized because these documents are difficult to manually review, translate, and analyze. Imagine, for example, trying to read every journal article that might mention a specific disease or side effect, or monitoring all Twitter posts for relevant references. It's a near-impossible feat.
As a result, pharmaceutical companies often ignore these resources and focus instead on more structured real world data assets. That can mean missing valuable trends that might otherwise inform drug development and commercialization efforts, and it can delay the understanding of risks or of opportunities to accelerate research.
However, this barrier to knowledge can be eliminated when pharmaceutical companies use natural language processing (NLP) to review and analyze unstructured content. NLP should be incorporated as a vital part of any real world data strategy.
What is NLP and why should you care?
NLP is a form of artificial intelligence (AI) that can understand, interpret, and analyze human language. The technology uses algorithms to automatically read and extract relevant data from unstructured content, then analyzes, standardizes, and summarizes points of interest in clean, comparable formats.
It can even clarify complicated text. For example, NLP algorithms can be trained to identify novel connections and normalize inconsistent terminology, such as interpreting “hot,” “sweats,” “feverish,” and “temperature” as all relating to fevers. Adjacent meanings and relationships found within the text can also be interpreted to provide a more accurate understanding of the information. This makes it possible to quickly capture all references to a specific trend in a patient population, treatment, or disease category.
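As a rough illustration of the terminology normalization described above, the Python sketch below maps several surface terms to a single concept. It is a deliberately simple dictionary lookup, not a trained NLP model and not any vendor's method; the `SYNONYMS` map, the `normalize_terms` function, and the sample notes are all hypothetical.

```python
import re
from typing import List

# Hypothetical synonym map: each surface term points to one normalized concept.
# A production system would learn or curate far richer mappings and use context.
SYNONYMS = {
    "hot": "fever",
    "sweats": "fever",
    "feverish": "fever",
    "temperature": "fever",
}


def normalize_terms(text: str) -> List[str]:
    """Return the normalized concept for every known term found in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [SYNONYMS[token] for token in tokens if token in SYNONYMS]


# Invented sample notes, for illustration only.
notes = [
    "Patient reports feeling hot with night sweats.",
    "Feverish since Monday; temperature taken twice.",
    "No complaints today.",
]

# Keep only the notes that mention the normalized concept "fever".
fever_notes = [note for note in notes if "fever" in normalize_terms(note)]
print(len(fever_notes), "of", len(notes), "notes mention fever")
```

A lookup like this ignores context (a "hot" drink would be misread as a fever), which is exactly the gap that trained models and relationship extraction are meant to close.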
NLP solutions can also conduct quantitative feature extraction of numeric information and ranges within text (e.g., references to height, weight, dates, or age) to further clarify the context in which they appear.
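The kind of quantitative feature extraction mentioned above can be approximated, in a very limited way, with regular expressions. The sketch below is only an assumption about how such a step might look. The pattern names, the units covered (years, kg, cm, ISO dates), and the sample sentence are hypothetical, and real clinical text needs far more robust handling of formats and units.

```python
import re
from typing import Dict

# Hypothetical patterns covering a few common formats only.
PATTERNS = {
    "age_years": re.compile(r"\b(\d{1,3})[\s-]*(?:years?|yrs?)[\s-]*old\b", re.IGNORECASE),
    "weight_kg": re.compile(r"\b(\d+(?:\.\d+)?)\s*kg\b", re.IGNORECASE),
    "height_cm": re.compile(r"\b(\d+(?:\.\d+)?)\s*cm\b", re.IGNORECASE),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def extract_measurements(text: str) -> Dict[str, str]:
    """Return the first match for each pattern, keyed by feature name.

    Values are kept as strings; converting and validating them is left out.
    """
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found


# Invented sample sentence, for illustration only.
sample = "A 54-year-old male, weight 82.5 kg, height 178 cm, seen on 2023-04-17."
print(extract_measurements(sample))
```

Keeping each feature under a named key is what lets the extracted values be loaded into a table and compared across documents.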
In addition to data extraction, NLP provides normalization, structure and enrichment to data sets that can be fed into applications, visualization tools, AI/ML, and other platforms that make it easy to share and compare results and use the information to inform key decisions.
NLP Use Cases
When developers integrate NLP into their real world data program, they can access a world of new information that can inform different stages of drug development and unlock new findings.
Many clients use NLP to assist with literature reviews and as a way to stay abreast of the latest trends and standards of care. Using repeatable algorithms, NLP workflows can be run automatically, ensuring the most up-to-date published sources become part of the organization's knowledge base.
In early phase development, NLP is often used to review claims databases, electronic medical records, and other real world data sets to understand the disease journey, identify unmet needs, prioritize molecules, and define endpoints of interest.
For example, IQVIA recently worked with a pharmaceutical company that wanted to understand prescribing patterns for patients at heightened risk of metabolic imbalance. In this case, NLP was used to review the electronic medical records (EMR) of over 5,000 patients in 10 countries to understand how physician prescribing patterns in the real world vary from guidelines and self-reported practices. Working from normalized data derived from physician notes, the study team was able to derive the reasons for treatment decisions and identify possible deviations from guidelines. The resulting data captured granular real world insights that would otherwise not have been available.
NLP can also bring consistency to terminology and add clarity to the relationships between concepts, such as the impact of one activity or a drug on a gene. It can also capture behavioral information around lifestyles, eating habits, comorbidities, wellness activities, and other behaviors of interest.
NLP can 'listen' to the voice of the patient, helping to accelerate appropriate responses to the common and less common questions, challenges, and concerns they have. Automatically categorizing topics and subtopics of patient and physician interactions allows life sciences organizations to be more responsive to the needs of patients without having to manually review the data.
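To make the idea of automatic topic categorization concrete, here is a minimal keyword-based Python sketch. It is an assumption for illustration, not a description of any product: the topic names, keyword sets, `categorize` function, and sample messages are hypothetical, and a real system would more likely use a trained classifier than fixed keyword lists.

```python
import re
from collections import Counter
from typing import List

# Hypothetical topics and trigger keywords.
TOPIC_KEYWORDS = {
    "side_effects": {"nausea", "rash", "dizzy", "headache"},
    "dosing": {"dose", "dosage", "missed", "schedule"},
    "access": {"cost", "insurance", "pharmacy", "refill"},
}


def categorize(message: str) -> List[str]:
    """Return every topic whose keywords appear in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    topics = [topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords]
    return topics or ["uncategorized"]


# Invented sample messages, for illustration only.
messages = [
    "I missed a dose yesterday, what should I do?",
    "Is the rash a known reaction? I also feel dizzy.",
    "My insurance will not cover the refill.",
    "Thanks for the help!",
]

# Count how often each topic comes up across all messages.
topic_counts = Counter(topic for message in messages for topic in categorize(message))
print(topic_counts)
```

Counting topics across many interactions is what turns individual messages into a trend that a team can act on without reading each one.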
NLP offers life sciences companies a way to unlock the value in unstructured data, and to pull insights from a variety of sources. This includes documents captured in locations such as SharePoint or data from subscription services, which are often locked away in silos that create inefficiencies. Having technology that can bring all of that information to a central location for analysis solves the unstructured data dilemma by creating a central database of knowledge.
Wherever there is rich information within free text, document content, or snippets of text, NLP has an important role to play in pulling out that information and informing teams across life sciences and healthcare.