A multidisciplinary team at UT Southwestern Medical Center has developed an artificial intelligence (AI)-enabled pipeline that quickly and accurately extracts key information from complex, free-text medical records.
The technique, described in the journal npj Digital Medicine, could substantially reduce the time required to prepare datasets for analysis in research studies.
David Hein, M.S., a Data Scientist in the Lyda Hill Department of Bioinformatics at UT Southwestern and the first author of the study, emphasized the traditional challenges researchers face in synthesizing detailed datasets from free-text records. “Constructing highly detailed, accurate datasets from free-text medical records is extremely time-consuming, often requiring extensive manual chart review,” he explained.
The study describes how the team used a large language model (LLM) to analyze more than 2,200 kidney cancer pathology reports. The goal was to assess the model’s capacity to recognize and categorize different tumor types.
Collaboration played a critical role in refining this workflow. The team, which included AI scientists, pathologists, clinicians, and statisticians, undertook several rounds of testing to enhance the model’s ability to process complex and nuanced medical information. Their findings were validated against pre-existing electronic medical record (EMR) data to confirm the accuracy of the results.
The pipeline achieved 99% accuracy in identifying tumor types and 97% accuracy in determining whether the cancer had metastasized.
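For readers who want to see what such an accuracy figure means in practice, the following is a minimal Python sketch. It assumes accuracy is the share of reports whose extracted label matches a reference label; the article does not spell out the study's exact metric. The function name `accuracy` and the sample labels are hypothetical and are not taken from the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of reports whose extracted label matches the reference label.

    Assumption: labels are compared case-insensitively after trimming
    whitespace. The study's actual comparison rules are not described here.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one report is required")
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predicted, reference)
    )
    return matches / len(predicted)


# Hypothetical labels for four reports (illustrative only, not study data).
llm_labels = ["clear cell", "papillary", "chromophobe", "clear cell"]
emr_labels = ["clear cell", "papillary", "clear cell", "Clear Cell"]

score = accuracy(llm_labels, emr_labels)
print(f"{score:.0%}")  # prints "75%": three of the four labels agree
```

In this toy example the third report is the only disagreement, so the score is 3 out of 4. A real evaluation would also need to decide how to treat missing or ambiguous reference labels.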
Payal Kapur, M.D., a Professor of Pathology and Urology and co-leader of the study, underlined the intricacies involved in using AI for data extraction from narrative reports. “The biggest challenge in training AI to extract data from narrative reports is that clinicians use a wide range of open-ended terms to describe the same finding,” she said.
Kapur added, “It’s not as simple as counting ‘yes-no’ results. Every report contains hundreds of details in narrative form. But with proper input and oversight, an AI model can efficiently review and categorize vast amounts of records with speed and accuracy.”
An essential phase of the project involved testing the AI pipeline across a larger dataset of more than 3,500 internal kidney cancer pathology reports, which yielded similarly favorable results. This testing benefited substantially from the high-quality, curated data provided by UT Southwestern’s Kidney Cancer Program.
James Brugarolas, M.D., Ph.D., Director of the Kidney Cancer Program, noted that a collaborative approach among various specialties was crucial. “The key is collaborative teamwork across specialties to refine AI instructions and ensure accuracy,” he commented.
While this study primarily focused on kidney cancer, the researchers suggest that the methodology could have applications beyond this specific tumor type. Andrew Jamieson, Ph.D., an Assistant Professor and Principal Investigator in the Lyda Hill Department of Bioinformatics, remarked, “There is no ‘one-size-fits-all’ model for medical data extraction. But our study outlines key strategies that can help other researchers use AI-powered LLMs more effectively in their own specialties.”
He also expressed enthusiasm about future developments: “We’re excited to continue refining this process and expanding AI’s role in medical research.”
Other UTSW researchers who contributed to the study include Bingqing Xie, Ph.D., Joseph Vento, M.D., Lindsay Cowell, Ph.D., Scott Christley, Ph.D., Ameer Hamza Shakur, Ph.D., Michael Holcomb, M.S., Alana Christie, M.S., Neil Rakheja, and AJ Jain.
Dr. Kapur holds the Jan and Bob Pickens Distinguished Professorship in Medical Science, in Memory of Jerry Knight Rymer and Annette Brannon Rymer and Mr. and Mrs. W.L. Pickens, while Dr. Brugarolas holds the Sherry Wigley Crow Cancer Research Endowed Chair in Honor of Robert Lewis Kirby, M.D. Both are also members of the Simmons Cancer Center.
The research was funded by a grant from the National Cancer Institute’s Kidney Cancer Specialized Program of Research Excellence and an endowment from the Brock Fund for Medical Science Chair in Pathology.
UT Southwestern Medical Center, recognized as one of the nation’s leading academic medical centers, integrates cutting-edge biomedical research with exceptional clinical care and education. The institution’s faculty have garnered six Nobel Prizes and include numerous distinguished members of national academies. With over 3,200 full-time faculty members, UT Southwestern continues to push the boundaries of medical research and clinical therapeutics, attending to over 140,000 hospitalized patients and managing millions of outpatient visits annually.