Synthetic control arms (SCAs) have gotten a lot of buzz in recent years, and rightfully so. They offer several key advantages for pharmaceutical and medical device clinical trials, where the demand for large numbers of participants, patients’ fears of receiving a placebo treatment, and the high cost of identifying, recruiting, and monitoring participants can pose significant challenges.
Simply put, rather than comparing a treatment arm with an untreated control arm, SCAs use real-world data to model the desired comparator. This approach, as outlined by BCG, can both lower costs and improve efficiency. The concept is not new: it has been used in areas where a no-treatment control arm would be unethical, such as oncology studies. In those cases, the control group consisted of patients receiving the standard of care.
As we look to the future trajectory of SCAs, it’s important to consider the impact of emerging technologies. Advanced data science platforms that can assimilate data from diverse electronic sources, or eSource, combined with machine learning and large language models, are positioned to play a pivotal role in powering the clinical trials of tomorrow.
Phase 1: Collecting and cleaning the data
Recent technological advances have made it possible to collect health data from electronic health records (EHRs), claims, patient devices such as home medical equipment and wearables, registry studies, and historical clinical trials. When standardized and combined into a database, these records can be used to create SCAs that mimic the characteristics of actual control groups.
These “neutral” SCAs have the potential to reduce or eliminate the need to recruit and enroll control participants, which increases efficiency, reduces delays, lowers trial costs and, most importantly, speeds new therapies to patients. In a trial requiring a 500-patient treatment arm, investigators would conventionally need to recruit 1,000 patients (500 for the treatment arm and 500 for the control arm). With an SCA, they need to locate only the 500 participants for the treatment arm.
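The enrollment arithmetic can be written out as a short Python sketch. It assumes the SCA fully replaces the recruited control arm, which is the best case described here. The function name and the `control_ratio` parameter are hypothetical and introduced only for illustration.

```python
def required_enrollment(treatment_arm_size: int, uses_sca: bool,
                        control_ratio: float = 1.0) -> int:
    """Return how many patients must be recruited.

    control_ratio is a hypothetical parameter: recruited control patients
    per treatment patient (1.0 means a 1:1 design).
    """
    if uses_sca:
        # Assumption: the synthetic arm replaces every recruited control.
        return treatment_arm_size
    return treatment_arm_size + round(treatment_arm_size * control_ratio)


conventional = required_enrollment(500, uses_sca=False)
with_sca = required_enrollment(500, uses_sca=True)
print(conventional, with_sca, conventional - with_sca)  # 1000 500 500
```

In practice an SCA may only reduce the control arm rather than eliminate it, in which case the savings would be smaller than this best case.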
Phase 2: Synthetic registries
Platforms such as REDCap Cloud, which can pull in data from these disparate sources, are ushering in a new era of cloud-based data collection and standardization. Normalizing the data makes it possible to create synthetic registries: databases of standardized patient data that can be used to generate SCAs.
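To make the idea of normalization concrete, here is a minimal Python sketch that maps records from two differently shaped sources onto one shared schema. Everything in it is an assumption made for illustration: the `RegistryRecord` fields, the input field names, and the sample rows are hypothetical, and it does not represent the API or data model of REDCap Cloud or any other platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RegistryRecord:
    """One row of a hypothetical standardized registry schema."""
    patient_id: str
    birth_year: int
    sex: str             # "F", "M", or "U" for unknown
    diagnosis_code: str  # upper case, no dots
    source: str


SEX_MAP = {"f": "F", "female": "F", "m": "M", "male": "M"}


def from_ehr(row: dict) -> RegistryRecord:
    # Assumed EHR export: ISO date of birth, spelled-out gender, dotted code.
    return RegistryRecord(
        patient_id=f"ehr-{row['mrn']}",
        birth_year=date.fromisoformat(row["dob"]).year,
        sex=SEX_MAP.get(row["gender"].strip().lower(), "U"),
        diagnosis_code=row["icd10"].replace(".", "").upper(),
        source="ehr",
    )


def from_claims(row: dict) -> RegistryRecord:
    # Assumed claims feed: birth year as text, one-letter sex, undotted code.
    return RegistryRecord(
        patient_id=f"claims-{row['member_id']}",
        birth_year=int(row["birth_year"]),
        sex=SEX_MAP.get(row["sex"].strip().lower(), "U"),
        diagnosis_code=row["dx"].replace(".", "").upper(),
        source="claims",
    )


# Made-up sample rows, one per source.
ehr_rows = [{"mrn": "001", "dob": "1980-04-12", "gender": "Female", "icd10": "E11.9"}]
claims_rows = [{"member_id": "A9", "birth_year": "1975", "sex": "M", "dx": "e119"}]

registry = [from_ehr(r) for r in ehr_rows] + [from_claims(r) for r in claims_rows]
for record in registry:
    print(record)
```

Keeping one small adapter per source means a new data feed can be added without touching the shared schema or the code that later queries the registry.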
Consider a trial investigating a treatment for HIV. Using a synthetic registry of 10,000 patients, the trial investigator can search for potential participants using key information, such as lab values or demographic data, to identify patients with the disease who match the trial’s inclusion criteria.
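A registry search like this could be sketched in Python as a simple filter. The `Patient` fields, the sample records, and the criteria themselves (age range and CD4 threshold) are hypothetical placeholders, not the criteria of any actual trial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    age: int
    hiv_positive: bool
    cd4_count: int  # lab value, cells per cubic millimeter


def meets_inclusion_criteria(p: Patient) -> bool:
    # Hypothetical criteria for illustration only: adults aged 18 to 65
    # with confirmed HIV and a CD4 count of at least 200.
    return p.hiv_positive and 18 <= p.age <= 65 and p.cd4_count >= 200


# A tiny made-up stand-in for the 10,000-patient registry.
registry = [
    Patient("p-001", 34, True, 450),
    Patient("p-002", 71, True, 380),   # excluded: outside the age range
    Patient("p-003", 29, False, 900),  # excluded: does not have the disease
    Patient("p-004", 52, True, 150),   # excluded: CD4 count below threshold
    Patient("p-005", 18, True, 200),
]

candidates = [p for p in registry if meets_inclusion_criteria(p)]
print([p.patient_id for p in candidates])  # ['p-001', 'p-005']
```

Because the registry is already standardized, the same predicate can be applied to every record regardless of which source it originally came from.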
Phase 3: Truly synthetic “patients”
The next frontier for SCAs, albeit a theoretical one, would use advances in artificial intelligence, such as machine learning and large language models (LLMs), to create entirely synthetic control arms. By digesting the large volumes of actual patient data assembled during Phase 2, an LLM could be used to generate “fake” patient profiles to populate SCAs.
In summary, synthetic control arms represent a potentially groundbreaking solution to the challenges of clinical research, offering efficiency gains and cost reductions. By using real-world data to model comparators and advancing through the phases of data collection, synthetic registries, and the theoretical prospect of truly synthetic “patients,” SCAs have the potential to streamline trial processes and accelerate the delivery of new therapies to patients. With the integration of emerging technologies, including advanced data science platforms and artificial intelligence, SCAs will continue to evolve at a rapid pace.