How NLP Expands the Power of Clinical Notes

February 25, 2026

Electronic health records (EHRs) can capture quantifiable aspects of patient care, such as diagnoses, prescriptions, imaging, and lab test results, all of which are essential for advancing medical research. But additional data about therapeutic measures administered to or accessed by patients to prevent disease could provide a more complete picture of the nation’s health. In a recent project, Westat uncovered this information by using a research approach that went beyond checked boxes and standardized medical codes. This approach enabled the extraction of critical data from unstructured sections of EHRs: physician notes.

“We had a great opportunity in front of us,” says Westat’s Kevin Wilson, PhD, Vice President for Clinical Research and project lead. “We needed to develop a way to pull data from EHRs that may not exist among the medical diagnosis and procedure codes used for insurance purposes, since coding for therapeutic measures to prevent respiratory viruses is not mandatory. Sensitive to our client’s needs, our plan was to develop natural language processing [NLP] methods to extract this information from the text-based narrative sections of patient records, ensuring patient confidentiality.” Wilson says this decision was based on the success of previous studies using a rules-based NLP model to pull the information from free-text clinical notes.

Applying a Rules-Based Algorithm

With the NLP methods created, the Westat team implemented a multistage, rules-based algorithm that examined these narratives line by line and captured the essential data. Because the rules also read modifiers, the algorithm could determine whether a preventive therapeutic measure was recommended, rejected, or received.
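
To make the idea of a line-by-line, modifier-aware rule set concrete, here is a minimal sketch in Python. It is not Westat’s algorithm, whose rules are not given in this article. The cue words, the `MEASURE` pattern, and the function names `classify_line` and `classify_note` are all hypothetical, chosen only to show how a rule can first detect a mention of a preventive measure and then read nearby modifiers to label it as recommended, rejected, or received.

```python
import re

# Hypothetical vocabulary for illustration only; the real rule set is not public here.
MEASURE = re.compile(r"\b(vaccine|vaccination|immunization)\b", re.IGNORECASE)

# Rules are checked in order, so a rejection cue overrides a recommendation cue
# when both appear on the same line.
STATUS_CUES = [
    ("rejected", re.compile(r"\b(declined|refused|does not want)\b", re.IGNORECASE)),
    ("received", re.compile(r"\b(received|administered|given)\b", re.IGNORECASE)),
    ("recommended", re.compile(r"\b(recommend(?:ed|s)?|advised|counseled)\b", re.IGNORECASE)),
]


def classify_line(line):
    """Return a status label for one line, or None if no measure is mentioned."""
    if not MEASURE.search(line):
        return None
    for status, pattern in STATUS_CUES:
        if pattern.search(line):
            return status
    # A measure is mentioned but no modifier rule fired.
    return "mentioned"


def classify_note(note):
    """Examine a note line by line and return (line_number, status) pairs."""
    results = []
    for number, line in enumerate(note.splitlines(), start=1):
        status = classify_line(line)
        if status is not None:
            results.append((number, status))
    return results


example_note = (
    "Patient reports cough.\n"
    "Recommended flu vaccine; patient declined.\n"
    "Received RSV vaccine today."
)
print(classify_note(example_note))
```

In this sketch, rule order stands in for the "multistage" idea: more specific cues, such as a refusal, are tested before more general ones. A production system would likely also need negation handling, section detection, and a much richer vocabulary.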

The team applied the algorithm to a sample of 20,000 patients who received care at the University of Colorado Anschutz Medical Campus from August 1 to December 31, 2023. To assess performance, the team manually reviewed 400 individual notes and then analyzed the algorithm’s concurrence with structured data, using precision and recall as evaluation metrics.
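
For readers less familiar with these metrics, the sketch below shows how precision and recall are typically computed when algorithm output is compared with a manual review. The function name `precision_recall` and the sample labels are hypothetical and are not data from the study. Precision is the share of algorithm-flagged notes that reviewers confirmed, and recall is the share of reviewer-confirmed notes that the algorithm flagged.

```python
def precision_recall(predicted, reference):
    """Compute precision and recall for paired boolean labels.

    predicted: labels from the algorithm (True = measure detected)
    reference: labels from manual review (True = measure actually present)
    """
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p and r)
    false_pos = sum(1 for p, r in pairs if p and not r)
    false_neg = sum(1 for p, r in pairs if not p and r)

    # Guard against division by zero when nothing was flagged or nothing was present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up labels for five notes, for illustration only.
algorithm_labels = [True, True, True, False, False]
reviewer_labels = [True, True, False, True, True]

precision, recall = precision_recall(algorithm_labels, reviewer_labels)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A high recall corresponds to the statement that an algorithm detected nearly all of the cases found by human review, while a high precision means few of its detections were false alarms.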

“When measured against human review, the algorithm detected nearly all of the cases for the designated respiratory illnesses being studied and identified a high proportion of preventive therapy applications for 3 respiratory viruses,” Wilson notes. “Had we relied only on structured EHR fields, we wouldn’t have been able to extract this detailed information.”

Benefiting a Range of Therapeutic Areas

Wilson says Westat’s research demonstrates that NLP could be used to assess free-text clinical notes across a range of therapeutic areas, for example to identify chronic cough, internal bleeding events, surgical site infections, fall risk, and a range of mental health issues.

“NLP methods have also been effective in identifying early predictors and longitudinal progression of diseases, their causes, treatments, and means of safeguarding against them,” Wilson explains. “And all of these applications can help maintain and improve the health of everyday Americans.”

Learn More

A deeper look at this research is featured in a newly published article in Frontiers in Digital Health (February 2026), where Wilson served as lead co-author.
