Can natural language processing improve the completeness of immunization data?
Harnessing natural language processing to enhance immunization information
Challenge
The VISION surveillance collaboration aims to support public health efforts by routinely collecting and synthesizing respiratory illness data and publishing timely statistical analyses, including vaccine effectiveness (VE).
Surveillance methods often rely on the structured data elements within the electronic health record (EHR) to identify key health characteristics, procedures, and conditions. However, in some instances, details may only be documented in the narrative, text-based sections of the EHR.
Solution
Rule-based natural language processing (NLP) methods have the potential to identify health conditions and procedures using a dictionary of key terms and pattern-matching techniques. Using a combination of synthetic and publicly available data, we developed NLP rules that leveraged grammatical dependencies within the text and accounted for a range of text variations. Our rules aimed to identify immunizations for respiratory infections, influenza, and respiratory syncytial virus (RSV).
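As an illustration only, the dictionary and pattern-matching idea can be sketched in Python. The term lists, negation cues, function names, and 40-character look-back window are assumptions made for this example, not the project's actual rules. A simple look-back for negation words stands in for the grammatical-dependency analysis described above.

```python
import re

# Hypothetical term dictionary: vaccine category -> surface forms seen in notes.
VACCINE_TERMS = {
    "influenza": ["flu shot", "flu vaccine", "influenza vaccine", "influenza vaccination"],
    "rsv": ["rsv vaccine", "rsv vaccination", "respiratory syncytial virus vaccine"],
}

# Hypothetical negation cues; a production rule set would be far broader.
NEGATION_CUES = re.compile(
    r"\b(?:no|not|never|declined|declines|refused|refuses|denies)\b",
    re.IGNORECASE,
)


def compile_patterns(terms):
    """Build one case-insensitive regex per vaccine category."""
    patterns = {}
    for category, forms in terms.items():
        # Put longer forms first so they are preferred over shorter ones.
        ordered = sorted(forms, key=len, reverse=True)
        alternation = "|".join(re.escape(form) for form in ordered)
        patterns[category] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


PATTERNS = compile_patterns(VACCINE_TERMS)


def find_immunizations(note, window=40):
    """Return vaccine categories mentioned without a preceding negation cue."""
    found = set()
    # Naive sentence split so a negation in one sentence does not affect the next.
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for category, pattern in PATTERNS.items():
            for match in pattern.finditer(sentence):
                start = max(0, match.start() - window)
                preceding = sentence[start:match.start()]
                if not NEGATION_CUES.search(preceding):
                    found.add(category)
    return found
```

Calling `find_immunizations` on a note such as "Patient received flu shot at pharmacy last week. Declined RSV vaccine." would flag influenza but not RSV, because "Declined" precedes the RSV mention in the same sentence.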
We applied the rules to a large sample of individuals drawn from the EHRs of a large health care system and measured performance in three ways: manually reviewing individual notes, assessing concurrence with structured data, and comparing performance with prior methods.
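The evaluation steps above could be tallied along the following lines. This is a minimal sketch under assumed inputs: the function names, the `(patient_id, vaccine)` pair representation, and the use of precision as the manual-review metric are illustrative choices, not details reported by the project.

```python
def compare_with_structured(nlp_hits, structured_hits):
    """Compare two sets of (patient_id, vaccine) pairs.

    Returns the pairs found by both sources, by NLP only (candidate
    additional immunizations), and by structured data only.
    """
    return {
        "both": nlp_hits & structured_hits,
        "nlp_only": nlp_hits - structured_hits,
        "structured_only": structured_hits - nlp_hits,
    }


def precision_from_review(review_labels):
    """Share of NLP hits confirmed by manual review.

    review_labels is a list of booleans, True when a reviewer confirmed
    the immunization in the note. Returns None if nothing was reviewed.
    """
    if not review_labels:
        return None
    return sum(review_labels) / len(review_labels)
```

Under this framing, the `nlp_only` set corresponds to immunizations documented only in narrative text, while the `both` set indicates where the notes and the structured fields agree.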
Results
We were able to identify additional immunizations for respiratory infections, influenza, and RSV that were not present in the structured data. This research demonstrated the feasibility and utility of NLP methods to enhance structured vaccine information from EHRs and served to validate the information contained within the structured EHR data.
Focus Areas
Biomedical Informatics and Data Coordination
Clinical Research
Disease Surveillance
Public Health
Capabilities
Biomedical Informatics and Data Coordination
Biostatistics and Epidemiology
Data Analytics, Clinical Data Science, and AI
Data Collection
Data Science
EHR Harmonization and Analysis
Natural Language Processing and Text Analytics
Synthetic Data Generation
Senior Expert Contact
Kevin Wilson
Vice President