Written by: Jeff Elton, PhD; Pyeush Gurha; and Vivek Vaidya
The July 18th edition of Nature reported that medicine is plagued by untrustworthy clinical trials. Citing seminal work by John Carlisle, M.D., an anesthesiologist in the U.K. National Health Service, the article summarized his key conclusions: 44% of trials contained impossible statistics, incorrect calculations, or duplicated numbers or figures, with 26% of trials assessed as having results impossible to trust.
Trust is the foundation of good research. Research lays the foundation for regulatory decisions and generates the evidence that guides diagnostic and treatment decisions. Research that can’t be trusted weakens the entire system and casts a pall over good research.
When ConcertAI designed its suite of clinical trial technologies, it did so from the perspective of building trustworthy research, with clear provenance of the data and with both inspection of study data and replication of study results facilitated. We had years of experience working in electronic medical record environments (an increasingly important source of clinical data for both retrospective and prospective research) and understood where different clinical concepts are located (e.g., structured versus unstructured data sources), where missingness may confound a direct data extraction process, and exactly what the current standard of care measures and collects versus what a specific study or trial design requires.
Our goal was fourfold: (a) provide a set of solutions that automate multimodal data collection, lowering the burden on healthcare provider research teams, allowing more direct time with the patient, and decreasing time spent on data entry after clinical hours; (b) ensure that all data sources support the study concepts, while simultaneously informing study designs and patient selection that are executable from standard-of-care data sources; (c) establish source data provenance and intra-study metadata that automatically stand as a history of activity available for review at any time; and (d) create a study package for the final analytic-ready data that includes fully processed data, source data supporting key elements, and the full history of provenance and metadata on all interactions.
Based on this, we were able to replicate the broadest set of study parameters in our own electronic case report form (eCRF) library, with each element mapped to the highest-veracity location from which that specific eCRF sub-element can be derived across an array of electronic medical record environments (e.g., some concepts, such as lab values, may appear in structured data, notes, and appended PDF documents from the diagnostic lab). Doing this required a deep and nuanced understanding of different electronic medical record systems, organ system cancers, and sometimes even the medical reporting and recording procedures of specific research sites.
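To make the idea of mapping an eCRF element to its highest-veracity source concrete, here is a minimal sketch. It is not ConcertAI's implementation. The source names, the priority order, and the `Candidate`, `ResolvedField`, and `resolve` names are all hypothetical, chosen only for illustration. The sketch picks the value from the most trusted source that has one, and keeps a pointer back to the source record, along with the unused candidates, so the choice can be reviewed later.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Hypothetical priority order (an assumption, not from the article):
# earlier entries are treated as higher-veracity sources.
SOURCE_PRIORITY = ("lab_pdf", "structured", "clinical_note")


@dataclass(frozen=True)
class Candidate:
    source: str    # which kind of source the value came from
    value: str     # the extracted value, kept as text
    location: str  # pointer back to the source record, for provenance


@dataclass(frozen=True)
class ResolvedField:
    element: str                        # eCRF element name
    value: str                          # chosen value
    source: str                         # source the value was taken from
    location: str                       # where in that source it was found
    alternatives: Tuple[Candidate, ...]  # lower-priority candidates, kept for review


def resolve(
    element: str,
    candidates: Sequence[Candidate],
    priority: Sequence[str] = SOURCE_PRIORITY,
) -> Optional[ResolvedField]:
    """Pick the candidate from the highest-priority source, if any."""
    known = [c for c in candidates if c.source in priority]
    if not known:
        return None  # nothing usable: the field stays missing, not guessed
    ranked = sorted(known, key=lambda c: priority.index(c.source))
    best, rest = ranked[0], tuple(ranked[1:])
    return ResolvedField(element, best.value, best.source, best.location, rest)


if __name__ == "__main__":
    found = [
        Candidate("clinical_note", "13.1", "note:123"),
        Candidate("structured", "13.2", "labs:row-7"),
    ]
    print(resolve("hemoglobin_g_dl", found))
```

Returning `None` when no known source has a value keeps missingness explicit, and retaining the alternatives preserves a record of what was available when the value was chosen.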
This accomplished several things. First, all environments were made equivalent, as we used the integration layer to handle the different workflows, data model structures, and document storage locations of different electronic medical record (EMR) environments. Second, this level of standardization allowed newer machine-based natural language processing and AI tools to be deployed with ever greater accuracy and recall, providing meaningful levels of automation and sparing site staff much of the data searching and data entry. For example, in a presentation at ASCO 2023 with trial sponsor Bristol Myers Squibb, we reported that BMS had achieved 55% automation of all data needing to be written to electronic case report forms. These tools thus became learning systems, in which corrections to data are used as a labeled dataset to train the system to be more accurate. Lastly, these systems produced a definitive record of all aspects of the patient record and of the process of completing a specific variable, with a new level of completeness, provenance, and traceability: a high-accuracy system assuring the integrity of the processes and data that support trustworthy trials.
Higher standards are possible and are now being deployed through the latest clinical trial technologies, such as our solutions. The work of biopharma innovators and the unmet medical needs of patients are both too important to relegate to flawed processes and obsolete technologies. While the research of Dr. Carlisle was about ‘willful’ errors, omissions, and fabricated entries, it was also about low-integrity methods that resulted in untrustworthy data and trials. Technologies with low burdens on research sites, direct clinical data extraction, high traceability, and complete provenance will all contribute to trustworthiness. These will increasingly underpin the real work at hand: advancing needed new medical innovations to patients with confidence and alacrity.