North Shore University Hospital/Zucker School of Medicine at Hofstra University, Manhasset, NY
Anthony J. Papale, MD1, Deepti Mahajan, MD1, Michael Ramada, MD1, Robert Flattau, BS2, Nandan Vithlani, BS2, Tiffany Zavadsky, NP, RN3, Anthony Carvino, 3, Daniel King, MD, PhD4, Sandeep Nadella, MD3 1North Shore University Hospital/Zucker School of Medicine at Hofstra University, Manhasset, NY; 2Zucker School of Medicine Hofstra University, Hempstead, NY; 3Northwell Health, Manhasset, NY; 4Northwest Health, Manhasset, NY
Introduction: Mucinous cystic lesions of the pancreas are precursor lesions of pancreatic cancer. These cysts are typically identified incidentally and then referred for specialist evaluation. We have enhanced our previously described natural language processing (NLP) tool, which automatically identifies radiological reports with abnormal pancreatic features, so that it now also classifies these lesions as high or low risk using large language models (LLMs).
Methods: We identified 1001 reports from February 14th to March 6th, 2024, as having pancreatic lesions using our NLP model. We manually classified cystic lesions in this dataset as no cyst, low-risk cyst, or high-risk cyst based on ACG guidelines. We deployed five LLMs (PaLM 2, Gemini 1.5 Pro, GPT-3.5 Turbo, GPT-4, GPT-4o) to perform automated cyst classification on a subset of 90 reports. Model performance was measured using Cohen’s Kappa coefficient for inter-rater agreement and sensitivity for classifying high-risk cysts. The highest-performing LLM was then used to assess the remaining 911 reports. Patients with high-risk cysts had their charts manually reviewed to determine if specialist follow-up was scheduled.
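The two performance measures named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the label strings, and the ten toy labels are assumptions made for the example and are not study data. Cohen's Kappa is computed as (observed agreement - chance agreement) / (1 - chance agreement), and sensitivity as the fraction of manually labeled high-risk cysts that the model also labeled high-risk.

```python
from collections import Counter


def cohens_kappa(manual, predicted):
    """Cohen's Kappa between two equal-length lists of category labels."""
    if len(manual) != len(predicted) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(manual)
    # Observed agreement: fraction of reports where both raters agree.
    observed = sum(m == p for m, p in zip(manual, predicted)) / n
    # Chance agreement: sum over labels of the product of marginal rates.
    manual_counts = Counter(manual)
    predicted_counts = Counter(predicted)
    labels = set(manual) | set(predicted)
    expected = sum(manual_counts[l] * predicted_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


def sensitivity(manual, predicted, positive="high-risk"):
    """Fraction of manually labeled positives that the model also labeled positive."""
    true_pos = sum(m == positive and p == positive for m, p in zip(manual, predicted))
    false_neg = sum(m == positive and p != positive for m, p in zip(manual, predicted))
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in the manual labels")
    return true_pos / (true_pos + false_neg)


# Toy labels (hypothetical, for illustration only): one high-risk cyst
# is misclassified as low-risk by the model.
manual_labels = ["high-risk"] * 4 + ["low-risk"] * 3 + ["no cyst"] * 3
model_labels = ["high-risk"] * 3 + ["low-risk"] * 4 + ["no cyst"] * 3

kappa = cohens_kappa(manual_labels, model_labels)
high_risk_sensitivity = sensitivity(manual_labels, model_labels)
```

In this toy case the model misses one of four high-risk cysts, so sensitivity is 0.75 even though overall agreement remains high, which is why the abstract reports both measures.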
Results: Within the subset of 90 reports, GPT-4 outperformed all other models with a Cohen’s Kappa of 0.983 and sensitivity of 1.00 for high-risk cysts. It maintained a Cohen’s Kappa of 0.962 and sensitivity of 0.988 with the inclusion of the remaining 911 reports. Of the total 1001 reports, 519 (51.8%) had pancreatic cysts, and 482 (48.2%) were excluded for having known pancreatic cancer, tumors, or lesions not described as cysts. Among the 519 cysts, 321 (61.8%) were classified as high-risk, requiring gastroenterology follow-up; however, only 84 of these patients (26.2%) had an appointment scheduled in our health system.
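The percentages in the Results follow directly from the reported counts, as the short check below shows. The variable names and the `pct` helper are hypothetical; the counts are those stated in the abstract. The check assumes that the 26.2% figure uses the 321 high-risk cysts as its denominator, which the arithmetic supports.

```python
# Counts as reported in the Results.
total_reports = 1001
reports_with_cysts = 519
reports_excluded = 482
high_risk_cysts = 321
follow_up_scheduled = 84


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


cyst_pct = pct(reports_with_cysts, total_reports)        # share of reports with cysts
excluded_pct = pct(reports_excluded, total_reports)      # share of reports excluded
high_risk_pct = pct(high_risk_cysts, reports_with_cysts)  # share of cysts that are high-risk
scheduled_pct = pct(follow_up_scheduled, high_risk_cysts)  # share of high-risk with follow-up

# Derived here, not stated in the abstract: high-risk cases with no
# appointment scheduled in the health system.
without_follow_up = high_risk_cysts - follow_up_scheduled
```

The two report categories also sum to the full dataset (519 + 482 = 1001), so no reports are unaccounted for.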
Discussion: We have demonstrated the potential utility of an NLP and LLM tool in the identification and classification of pancreatic cysts at scale. GPT-4 was the best performing LLM with outcomes comparable to our manual review. While 61.8% of cysts were recognized as high-risk and should receive GI follow-up based on pancreatic cyst management guidelines, only 26.2% had follow-up appointments scheduled. We now aim to perform a crossover study to assess if active identification and referral of patients using our automated tool will improve follow-up metrics relative to the current institutional process.
Figure: Figure 1. Comparison of Manual and Automated Classification of Pancreatic Cysts
Note: The table for this abstract can be viewed in the ePoster Gallery section of the ACG 2024 ePoster Site or in The American Journal of Gastroenterology's abstract supplement issue, both of which will be available starting October 27, 2024.
Disclosures:
Anthony Papale indicated no relevant financial relationships.
Deepti Mahajan indicated no relevant financial relationships.
Michael Ramada indicated no relevant financial relationships.
Robert Flattau indicated no relevant financial relationships.
Nandan Vithlani indicated no relevant financial relationships.
Tiffany Zavadsky indicated no relevant financial relationships.
Anthony Carvino indicated no relevant financial relationships.
Daniel King indicated no relevant financial relationships.
Sandeep Nadella indicated no relevant financial relationships.
Anthony J. Papale, MD1, Deepti Mahajan, MD1, Michael Ramada, MD1, Robert Flattau, BS2, Nandan Vithlani, BS2, Tiffany Zavadsky, NP, RN3, Anthony Carvino, 3, Daniel King, MD, PhD4, Sandeep Nadella, MD3. P1768 - Development and Validation of Large Language Models to Aid Automated Classification of Pancreatic Cysts at Scale, ACG 2024 Annual Scientific Meeting Abstracts. Philadelphia, PA: American College of Gastroenterology.