Sri Harsha Boppana, MBBS1, Manaswitha Thota, MD2, Sidharth Mahajan, MBBS3, Sachin Sravan Kumar Komati, MSc4, C. David Mintz, MD, PhD5 1Nassau University Medical Center, East Meadow, NY; 2Virginia Commonwealth University Medical Center, Richmond, VA; 3MacNeal Hospital Loyola Medicine, Berwyn, IL; 4Florida International University, Miami, FL; 5Johns Hopkins University School of Medicine, Baltimore, MD
Introduction: The incidence of pancreatic cancer continues to rise each year, and projections estimate that it will become the second-leading cause of cancer mortality in the United States by 2030. This study describes current trends in pancreatic cancer mortality and proposes a predictive model, built with a Long Short-Term Memory (LSTM) neural network, that projects future trends from demographic data.
Methods: We extracted mortality data for 1999-2020 from the Centers for Disease Control and Prevention (CDC) Wide-ranging ONline Data for Epidemiologic Research (WONDER) database, normalized the data, and numerically encoded categorical variables such as age, sex, ethnicity, and race. Using RStudio to analyze mortality trends, we developed an LSTM model, a type of network known for its ability to capture and use long-term patterns in sequential data. The model was trained with a sequence length of 20, so each prediction was based on the previous 20 data points. We then assessed its accuracy and generalization on a test set. The resulting model uses demographic variables to forecast annual mortality and survival rates.
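The preprocessing steps described in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' RStudio code. The function names (`label_encode`, `min_max_normalize`, `make_windows`), the choice of integer label encoding and min-max scaling, and the toy values are all assumptions, since the abstract does not say which encoding or normalization scheme was used. Only the sequence length of 20 comes from the abstract.

```python
from typing import Dict, List, Sequence, Tuple


def label_encode(categories: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct category to an integer code (assumed encoding scheme)."""
    codes = {name: i for i, name in enumerate(sorted(set(categories)))}
    return [codes[c] for c in categories], codes


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(
    series: Sequence[float], seq_len: int = 20
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `seq_len` consecutive points with the point that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(list(series[start:start + seq_len]))
        targets.append(series[start + seq_len])
    return inputs, targets


# Hypothetical placeholder values, not CDC WONDER data.
toy_rates = [10.0 + 0.1 * i for i in range(25)]
scaled = min_max_normalize(toy_rates)
X, y = make_windows(scaled, seq_len=20)
sex_codes, sex_mapping = label_encode(["Male", "Female", "Male"])

print(len(X), len(X[0]), len(y))  # 5 windows of 20 points each, 5 targets
print(sex_codes, sex_mapping)
```

Each row of `X` holds 20 consecutive normalized values and the matching entry of `y` is the value that immediately follows, which is the input and target arrangement a sequence model with a sequence length of 20 would be trained on.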
Results: Our analysis revealed higher mortality rates in the 85+ age group, in males, and in those of non-Hispanic ethnicity. We evaluated the accuracy of the trained model with the mean squared error (MSE), the average squared difference between predicted and actual values, where zero indicates no difference between the two. Our LSTM model achieved a training MSE of 0.0033 and a validation MSE of 0.0028, indicating good reliability and accuracy. Figure 1 compares actual and predicted mortality rates and visually corroborates the model’s accuracy.
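For readers unfamiliar with the metric, the mean squared error can be computed as in the short sketch below. The function name `mean_squared_error` and the sample values are hypothetical and are not taken from the study. If the reported MSE values were computed on normalized rates (an assumption, as the abstract does not specify the scale), they would be expressed in squared normalized units rather than squared deaths per 100,000.

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical normalized rates, not results from the study.
actual_rates = [0.20, 0.50, 0.90]
predicted_rates = [0.25, 0.45, 0.95]

mse = mean_squared_error(actual_rates, predicted_rates)
print(f"MSE = {mse:.4f}")
```

Every prediction here is off by 0.05, so each squared error is 0.0025 and the mean is about 0.0025, the same order of magnitude as the values reported above. An MSE of zero would require every prediction to match its actual value exactly.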
Discussion: Our analysis highlighted clear mortality trends and demonstrated the model’s effectiveness in forecasting pancreatic cancer mortality. However, the dataset lacks clinical variables (such as comorbidities, cancer stage, and prior treatments), and the model relies on a single dataset; both limitations may affect external validity. Future research is needed to determine whether the model can be applied to clinical prognosis or strategic planning. In summary, we created and validated an LSTM neural network model with the CDC WONDER dataset, with promising results. Future development will aim to include clinical characteristics to create a more personalized predictive tool.
Figure 1: Comparison of Actual and Predicted Crude Rates for Training and Testing Data
Disclosures:
Sri Harsha Boppana indicated no relevant financial relationships.
Manaswitha Thota indicated no relevant financial relationships.
Sidharth Mahajan indicated no relevant financial relationships.
Sachin Sravan Kumar Komati indicated no relevant financial relationships.
C. David Mintz indicated no relevant financial relationships.
Sri Harsha Boppana, MBBS1, Manaswitha Thota, MD2, Sidharth Mahajan, MBBS3, Sachin Sravan Kumar Komati, MSc4, C. David Mintz, MD, PhD5. P1743 - Refining Pancreatic Cancer Projections: Mortality Trends Analysis and LSTM Model Predictions, ACG 2024 Annual Scientific Meeting Abstracts. Philadelphia, PA: American College of Gastroenterology.