Expert Interview

Real-World Data: The Benefits, Limitations, and Future

May 12, 2023

Real-World Data (RWD) are broadly defined as data that have been collected as part of health care delivery and include electronic health records (EHRs), insurance claims databases, and disease-specific registries. Although these data sources were not designed with research in mind, advances in data infrastructure have led to a growing interest in using them to answer emerging research questions. We asked Kevin Wilson, PhD, a Westat Associate Director who leads and coordinates Westat’s Data Science Group, for insight into the latest in RWD.

Q. What are RWD?

A. RWD are data derived from multiple sources, such as EHRs, insurance claims, disease registries, wearable devices, administrative records, product consumption, and social media. When data from these existing sources, which were collected for a specific purpose, are used for research, they are termed “real-world data.”

Q. What are the benefits of applying RWD to solve clients’ issues?

A. One important benefit to using RWD is that it allows us to access large datasets more rapidly and less expensively. This is particularly important at a time when survey response rates are declining because we don’t have to recruit and interview participants. As an example, RWD were very useful in meeting the unprecedented challenge of COVID-19 because they enabled public health agencies to understand quickly the incidence and severity of the virus and rapidly develop vaccines and drugs to combat it. In other areas, the U.S. Food and Drug Administration uses RWD to identify adverse reactions to drugs, which is critical to ensuring drug safety.

Q. What are RWD’s limitations?

A. There are a few challenges associated with the use of RWD. When we link the multiple sources of digital data together, such as EHRs and insurance claims, it can increase the risk of identifying someone, particularly if it involves a person with a rare disease and distinct demographics.
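One common way to gauge this kind of re-identification risk is to count how many records share each combination of identifying attributes after linkage. The following is a minimal sketch of that idea, not a description of Westat's procedures: the field names, the threshold `k`, and all records are hypothetical and invented for illustration.

```python
from collections import Counter

def small_cells(records, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than k records.

    Combinations held by very few people are the ones most likely
    to single out an individual once data sources are linked.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n < k}

# Hypothetical linked EHR + claims records (all values are made up).
linked = [
    {"age_band": "30-39", "sex": "F", "zip3": "208", "diagnosis": "asthma"},
    {"age_band": "30-39", "sex": "F", "zip3": "208", "diagnosis": "asthma"},
    {"age_band": "30-39", "sex": "F", "zip3": "208", "diagnosis": "asthma"},
    {"age_band": "70-79", "sex": "M", "zip3": "208", "diagnosis": "rare disease X"},
]

risky = small_cells(linked, ["age_band", "sex", "zip3", "diagnosis"], k=3)
print(risky)
```

In this toy data, the single record with the rare diagnosis forms a group of one and is flagged, while the three matching asthma records are not.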

There is also the possibility of measurement errors because it’s not always clear that the data sources are measuring the concepts consistently. For example, in health records, a doctor may code a patient’s visit differently based on what the insurance company may cover, which can introduce subtle shifts in meaning.

Another limitation is that the data may not be representative of the general population. EHR data, for example, are gathered from people who are slightly sicker than the general population and who have access to health care, which excludes those who do not. Depending on the research question to be answered, these limitations can result in bias.

It’s also important to carefully assess the quality and completeness of the data. Data collected in health care facilities may not necessarily be comparable, and because the data are constantly evolving, they may lack reproducibility. However, we have procedures to assess the quality and potential for bias, and in general, the benefits outweigh the risks.

Q. How is Westat using RWD to support clients?

A. We have a number of projects in which we are using RWD. These include REDS-IV-P, DAWN, and VISION. We harmonize the data so it can be transformed into one cohesive dataset. In REDS-IV-P, a study that links blood donor, component characteristics, and recipient outcomes, with a focus on pediatric populations, we conduct analyses related to transfusion medicine practices and outcomes. DAWN is a nationwide public health surveillance system designed to provide early warning and ongoing monitoring of emerging drug trends and characteristics of drug and/or alcohol-related emergency department visits. The RWD work allows the identification of drugs and drug combinations seen in ED visits nationwide. And for VISION, which leverages existing virtual networks, including the VISION flu network, we integrate massive amounts of data from 9 medical systems across the U.S. The RWD from these systems are sent through a secure data pipeline where Westat administers quality checks and performs analyses, enabling swift reporting to the CDC.

Q. How does Westat stand apart from competitors in harnessing RWD?

A. We have the technologies to integrate data from multiple sources and the tools to map data to common data models. We have exceptional statisticians, data scientists, and epidemiologists who understand the sources of bias and can apply corrections to the data using techniques like weighting and imputation. Plus, we have subject matter experts deeply knowledgeable about a wide range of health outcomes. So, it is this comprehensive understanding of what’s needed to harness RWD to solve our clients’ challenges that distinguishes us from competitors.
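To make the weighting idea concrete, here is a minimal sketch of post-stratification, one standard form of weighting. It is an illustration only and not Westat's method: the strata, the population shares, and the outcome values are all hypothetical. Each record is weighted by its stratum's share in the target population divided by that stratum's share in the sample, so over-represented groups count for less.

```python
from collections import Counter

def poststratification_weights(sample_strata, population_shares):
    """Weight each record by population share / sample share of its stratum."""
    n = len(sample_strata)
    sample_counts = Counter(sample_strata)
    return [population_shares[s] / (sample_counts[s] / n) for s in sample_strata]

def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical EHR-like sample that over-represents sicker patients.
strata = ["chronic", "chronic", "chronic", "healthy"]
outcome = [1, 1, 0, 0]  # assumed indicator, e.g. 1 = had an emergency visit

# Assumed shares in the target population (invented for this example).
population_shares = {"chronic": 0.5, "healthy": 0.5}

weights = poststratification_weights(strata, population_shares)
unweighted = sum(outcome) / len(outcome)
weighted = weighted_mean(outcome, weights)
print(unweighted, weighted)
```

Here the unweighted rate is one half, while the weighted rate falls to about one third because the over-sampled "chronic" group is scaled down to its assumed population share.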

Q. What do you see as the future uses of RWD?

A. With the increased integration of data sources and availability of data, I can foresee RWD being harnessed to increase our understanding of population and individual health, which will enable us to tailor medical interventions more precisely to the needs of individual patients. The options for using RWD are manifold, and Westat will continue to bring our diverse resources to address the challenges. By bringing together skills in epidemiology, statistics, data science, and informatics, and with a broad knowledge of health outcomes, Westat is able to maximize the utility and quality of RWD and the real-world evidence it generates.

