


Improving Data Quality with Centralized Statistical Monitoring - with Dr. Paul Schuette and Xiaofeng (Tina) Wang

CDER Small Business and Industry Assistance Chronicles Podcast


Dr. Weber: Welcome to the CDER Small Business and Industry Assistance, or SBIA, Chronicles Podcast.

Today’s topic: Improving Data Quality with Centralized Statistical Monitoring

My name is Dr. Ellicia Weber, and today we are joined by Tina Wang, Statistician, and Dr. Paul Schuette, Deputy Division Director, in the Office of Biostatistics, Division of Analytics and Informatics, within FDA’s Center for Drug Evaluation and Research. Paul and Tina will be discussing FDA’s experiences with a centralized statistical monitoring tool.

Thank you both for joining us today!

Dr. Schuette: Our pleasure.

[Photo] Dr. Paul Schuette, Deputy Division Director | Office of Biostatistics Division of Analytics and Informatics | CDER | FDA

Dr. Weber: Let’s begin by discussing clinical investigations monitoring. Though FDA’s regulations require sponsors to monitor the conduct and progress of their clinical investigations, the regulations are not specific about how sponsors are to conduct such monitoring.

Dr. Schuette: FDA is a data-driven organization. We rely on the quality and integrity of the data submitted by sponsors to make appropriate decisions about the efficacy and safety of medical products. However, FDA does not have the resources to inspect every clinical trial site for data quality and integrity. And this is an issue for sponsors too.

Dr. Weber: So it’s very important to prioritize inspection resources.

Dr. Schuette: Yes. In practice, there are a range of approaches to monitoring plans that can vary depending on multiple factors. Traditionally, monitoring was conducted through 100% source data verification, which refers to the process of confirming that study data included in efficacy and safety analyses reflect the source data obtained during the clinical investigation, such as the data recorded on the case report form. This process can be expensive and inefficient, and it can still fail to detect underlying problems, such as data anomalies.

Alternatively, a risk-based approach to monitoring focuses sponsor oversight activities on preventing or mitigating important and likely risks to data quality, and to the processes critical to human subject protection and trial integrity. It does not suggest any less vigilance in the oversight of clinical investigations. It is a dynamic process and better allows for continual improvement in trial conduct and oversight.

There is a growing consensus that risk-based approaches to monitoring focused on risks to the most critical data elements and processes necessary to achieve study objectives are more likely to ensure subject protection and overall study quality than 100% source data verification and routine visits to all clinical sites. So, risk-based monitoring could actually improve sponsor oversight of clinical investigations.

Dr. Weber: FDA has published two guidance documents on the use of centralized statistical monitoring as part of a risk-based monitoring plan: “Oversight of Clinical Investigations — A Risk-Based Approach to Monitoring” and “A Risk-Based Approach to Monitoring of Clinical Investigations Questions and Answers.”

Dr. Schuette: Yes, and we encourage sponsors to refer to these guidance documents, which provide FDA’s current recommendations regarding monitoring practices. And we encourage sponsors to consider a change in approach to monitoring.

As you mentioned, these guidance documents discuss centralized monitoring, which is a remote evaluation carried out by the sponsor at a location other than the sites at which the clinical investigation is being conducted. Let me make just a few key points about this kind of evaluation.

First, centralized statistical monitoring is increasingly feasible with the advent of computerized systems and the increasing use of electronic records, which facilitate remote access to electronic data and to select source data. Second, statistical assessments using data captured with paper case report forms or via electronic data capture may permit timely identification of clinical sites that require additional training, monitoring, or both. We expect that the pharmaceutical and device industries will, for the foreseeable future, continue to use some amount of on-site monitoring, although we anticipate decreased use of on-site monitoring as monitoring methods and technological capabilities evolve.

[Photo] Xiaofeng (Tina) Wang | Statistician | Office of Biostatistics Division of Analytics and Informatics | CDER | FDA

So, we encourage greater use of centralized monitoring practices, where appropriate. Of course, this depends on various factors, including the sponsor’s use of electronic systems; the sponsor’s access to subjects’ electronic records, if applicable; the timeliness of data entry from paper case report forms; and the communication tools available to the sponsor and study site.

Dr. Weber: Your team in CDER’s Office of Biostatistics has had experience with a centralized statistical monitoring tool as part of an ongoing Cooperative Research and Development Agreement between CluePoints Inc. and FDA. Can you tell us about this?

Ms. Wang: Certainly, but first let me provide some background about the data collection and submission process for clinical trials. In a typical randomized clinical trial, clinical trial sites are overseen by an investigator and are required to use the same protocol and the same electronic case report form to collect subject data.

Data collected from the subjects’ case report forms, their lab tests, their responses to questionnaires, and other sources are then submitted to FDA using CDISC standards such as the Study Data Tabulation Model, or SDTM, and the Analysis Data Model, or ADaM.
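To make the standards concrete, here is a minimal sketch of what a few rows of an SDTM Demographics (DM) domain might look like. The variable names (STUDYID, DOMAIN, USUBJID, SITEID, SEX, AGE) follow the CDISC SDTM standard; the study identifier and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like an SDTM DM (Demographics) domain.
# Variable names follow the CDISC SDTM standard; the values are made up.
dm = pd.DataFrame({
    "STUDYID": ["ABC-001"] * 4,
    "DOMAIN":  ["DM"] * 4,
    "USUBJID": ["ABC-001-101-0001", "ABC-001-101-0002",
                "ABC-001-102-0001", "ABC-001-102-0002"],
    "SITEID":  ["101", "101", "102", "102"],
    "SEX":     ["M", "F", "F", "M"],
    "AGE":     [54, 61, 47, 58],
})

# Site-level summaries like this feed cross-site comparisons in
# centralized statistical monitoring.
print(dm.groupby("SITEID")["AGE"].mean())
```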

As you mentioned, there is an ongoing Cooperative Research and Development Agreement, or CRADA, whose purpose is to identify problematic sites that exhibit data anomalies and possible data quality issues in multi-site clinical trials submitted to the agency to support the approval of new therapies.

For the CRADA, we used the SDTM data, since SDTM data is usually only minimally processed and is therefore closer to the raw data; ADaM data has usually been cleaned and processed to a greater extent. We used the Statistical Monitoring Applied to Research Trials, or SMART, software in the CRADA to check the consistency of collected data across all sites and identify those sites that differ substantially from the others. Using advanced data analytics, the software calculates a data inconsistency score for each site.

I can’t show the visual in the podcast, but I will try to explain it. The final product of the software is a bubble plot, where the x-axis is the number of randomized subjects at a site, the y-axis is the site’s data inconsistency score, and each bubble represents a clinical site. The size of each bubble is proportional to the number of subjects at that site, so larger bubbles indicate larger sites.

A site is flagged as potentially anomalous if its data inconsistency score exceeds a threshold value, which the user may select to control the false discovery rate. Larger sites with higher scores are of greater potential concern and recommended for follow-up and possible inspection.
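To illustrate the general idea, here is a toy analogue of this kind of site scoring (not CluePoints’ proprietary method): each site’s data is compared against all other sites pooled with a Kolmogorov–Smirnov test, the p-value is converted into an inconsistency score, and sites are flagged using the Benjamini–Hochberg procedure to control the false discovery rate. The site names, sample sizes, and injected anomaly are all hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated trial: 5 sites, one measurement per subject; site "S5" has a
# shifted distribution (an injected anomaly). Real CSM tools score many
# variables and many statistical tests per site.
sites = {f"S{i}": rng.normal(100, 10, size=40) for i in range(1, 5)}
sites["S5"] = rng.normal(115, 10, size=40)  # anomalous site

def site_pvalue(name):
    """Two-sample KS test: this site's data vs. all other sites pooled."""
    own = sites[name]
    rest = np.concatenate([v for k, v in sites.items() if k != name])
    return stats.ks_2samp(own, rest).pvalue

names = list(sites)
pvals = np.array([site_pvalue(n) for n in names])

# Inconsistency score: -log10(p); larger means more atypical.
scores = -np.log10(pvals)

# Benjamini-Hochberg procedure to control the false discovery rate at q.
q = 0.05
m = len(pvals)
order = np.argsort(pvals)
thresholds = q * np.arange(1, m + 1) / m
passed = pvals[order] <= thresholds
k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
flagged = {names[i] for i in order[:k]}

for n, s in zip(names, scores):
    print(f"{n}: score={s:.2f} flagged={n in flagged}")
```

Paired with each site’s subject count, the (size, score) pairs are exactly the coordinates a bubble plot would display.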

Sensitivity analyses can also be performed by excluding the laboratory and questionnaire data, because those data contribute a disproportionate number of tests, potentially driving the results of the analyses. The atypical sites flagged previously may or may not remain above the threshold in the sensitivity analyses.

Dr. Weber: If anyone in our audience is interested in reading the paper and taking a look at the bubble plots, we are linking the paper from the Resources section of this episode’s webpage. 

Ms. Wang: I appreciate that!

Dr. Weber: Tina, tell us more about the results.

Ms. Wang: After identifying signals, we can dive further into the data listings and explore the problematic data. The software can graphically compare the data distributions of an outlying site against all sites. For example, one site was found to have enrolled mostly males, while the overall distribution was nearly balanced.
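A site-versus-rest comparison of this kind can be sketched with a simple chi-square test of independence. The enrollment counts below are hypothetical, chosen to mirror the mostly-male site in the example.

```python
from scipy import stats

# Hypothetical counts: one site enrolled mostly males, while the rest of
# the trial is roughly balanced (mirrors the example in the discussion).
site_males, site_females = 28, 2          # the outlying site
other_males, other_females = 210, 206     # all remaining sites pooled

table = [[site_males, site_females],
         [other_males, other_females]]

# Chi-square test of independence: is the sex distribution at this site
# consistent with the rest of the trial?
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square p-value for site vs. rest: {p:.4g}")
```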

The software also supports exploring subject level data patterns. For example, a subject whose blood pressure is constant across all visits would be an anomaly. In addition to comparing data patterns at the site level, the software also allows users to compare country level data patterns, with drill down capability as well.
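A subject-level check like the constant-blood-pressure example can be sketched as a zero-variance screen. The subject IDs, visit names, and readings below are hypothetical.

```python
import pandas as pd

# Hypothetical vital signs across visits; SUBJ-002's systolic blood
# pressure is identical at every visit, a classic anomaly pattern.
vs = pd.DataFrame({
    "USUBJID": ["SUBJ-001"] * 4 + ["SUBJ-002"] * 4,
    "VISIT":   ["V1", "V2", "V3", "V4"] * 2,
    "SYSBP":   [128, 131, 125, 133,   120, 120, 120, 120],
})

# Flag subjects whose measurements never vary (standard deviation of 0).
per_subject_sd = vs.groupby("USUBJID")["SYSBP"].std()
flagged = per_subject_sd[per_subject_sd == 0].index.tolist()
print(flagged)
```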

Dr. Weber: What are some of the potential causes of data anomalies?

Ms. Wang: Great question. Potential causes of data anomalies can include:

  • Errors due to technical problems such as mis-calibrated instruments, incorrect unit conversions, or other factors;
  • Sloppiness or incorrect reporting;
  • Tampered, fabricated, or altered data; or
  • Sites that are atypical of the underlying population.

Dr. Weber: Could you discuss some potential limitations of this tool?

Ms. Wang: Yes, as with any tool, there are limitations and challenges to centralized statistical monitoring. Challenges can arise from messy data, for example a lack of conformance to data standards, or from disruptions such as those caused by the recent pandemic. Comparisons across sites are best achieved when there are many sites in the clinical trial rather than just a few.

Moreover, if the data contamination rate is sufficiently high, then anomalies may not be detected. Even when anomalies are detected, their clinical significance may not be obvious. And if the SDTM data do not contain the variables of interest, then anomalies in those variables will not be detected.

Dr. Weber: Paul and Tina, thank you for joining us today! Are there any final thoughts you’d like to convey?

Dr. Schuette: Yes. We encourage our industry colleagues to explore these alternative approaches, and to understand that there is great value in using tools such as centralized statistical monitoring. Centralized statistical monitoring and tools such as the SMART software can be effective in identifying data anomalies, providing insights that assist the efforts of statistical and clinical reviewers, and helping ensure data quality and integrity for regulatory submissions.

We hope to extend our past efforts and build on them with new technologies such as artificial intelligence, and to include new trial practices, such as decentralized clinical trials. We encourage sponsors to tailor their monitoring plans to the needs of the trial.

Dr. Weber: You can find a link to the full SBIA Chronicles article at fda.gov/cdersbiachronicles. Also visit fda.gov/cdersbia to stay connected with upcoming webinars and conferences, sign up for SBIA email updates, and follow SBIA on LinkedIn. Thanks for tuning in!

Resources:
