Sri Harsha Boppana, MBBS¹, Manaswitha Thota, MD², Sidharth Mahajan, MBBS³, Sachin Sravan Kumar Komati, MSc⁴, C. David Mintz, MD, PhD⁵
¹Nassau University Medical Center, East Meadow, NY; ²Virginia Commonwealth University Medical Center, Richmond, VA; ³MacNeal Hospital Loyola Medicine, Berwyn, IL; ⁴Florida International University, Miami, FL; ⁵Johns Hopkins University School of Medicine, Baltimore, MD
Introduction: The incidence of lower GI cancer, including colorectal, anal, and small intestinal cancer, continues to rise, and colorectal cancer in particular is becoming more common in younger individuals. This study describes current trends in lower GI cancer and proposes a predictive model, built with Long Short-Term Memory (LSTM) neural networks, that projects future trends from demographic data.
Methods: We extracted mortality data from the Centers for Disease Control and Prevention (CDC) Wide-ranging ONline Data for Epidemiologic Research (WONDER) database spanning 1999, normalized it, and encoded categorical variables such as age, sex, ethnicity, and race numerically. We analyzed mortality trends in RStudio and developed an LSTM model, an architecture known for its ability to capture and use long-term patterns in sequential data. The model was trained with a sequence length of 20, so each prediction was based on the previous 20 data points. We then assessed its accuracy and generalization on a test set. The resulting model uses demographic variables to forecast annual mortality and survival rates.
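The windowing step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not state which normalization was used or how the input sequences were assembled, so min-max scaling, the helper names `min_max_normalize` and `make_windows`, and the synthetic series are all assumptions. Python is used here only for illustration; the authors report working in RStudio.

```python
import numpy as np


def min_max_normalize(values):
    """Scale values to [0, 1]. Min-max scaling is an assumed choice."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant series carries no range information; map it to zeros.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)


def make_windows(series, seq_len=20):
    """Pair each run of `seq_len` consecutive points with the point after it."""
    series = np.asarray(series, dtype=float)
    if len(series) <= seq_len:
        raise ValueError("series must be longer than seq_len")
    n_windows = len(series) - seq_len
    inputs = np.stack([series[i:i + seq_len] for i in range(n_windows)])
    targets = series[seq_len:]
    return inputs, targets


# Synthetic stand-in for a crude mortality rate series (not CDC WONDER data).
rates = np.linspace(10.0, 40.0, 25)
scaled = min_max_normalize(rates)
X, y = make_windows(scaled, seq_len=20)
print(X.shape, y.shape)
```

Each row of `X` holds 20 consecutive normalized values and the matching entry of `y` is the value that follows them, which is the input/target arrangement a sequence model with a sequence length of 20 would typically be trained on.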
Results: Mortality rates are highest among older individuals, particularly those aged 80-84 and 85+, with males and Hispanics experiencing the most significant effects in these groups. We evaluated the accuracy of the trained model with the Mean Squared Error (MSE), the average squared difference between predicted and actual values, where zero indicates a perfect match. Our LSTM model achieved a training MSE of 0.00053 and a validation MSE of 0.0145, indicating good reliability and accuracy. Figure 1 compares actual and predicted mortality rates and visually corroborates the model’s accuracy.
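For reference, the MSE metric named in the Results can be computed as in the short sketch below. The function name and the three sample values are illustrative assumptions only and are unrelated to the study's data or reported scores.

```python
import numpy as np


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.mean((actual - predicted) ** 2))


# Made-up normalized rates, for illustration only.
actual = [0.20, 0.40, 0.60]
predicted = [0.20, 0.50, 0.40]
mse = mean_squared_error(actual, predicted)
print(round(mse, 4))
```

Because the errors are squared, the value depends on the scale of the data, so an MSE computed on normalized rates is not directly comparable to one computed on raw rates.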
Discussion: Our analysis highlighted clear mortality trends and demonstrated the model’s effectiveness in forecasting lower GI cancer mortality. However, the dataset lacks clinical variables (such as comorbidities, cancer stage, and prior treatments), and the study relies on a single data source, both of which may limit external validity. Future research is needed to determine whether the model can be applied to clinical prognosis or strategic planning. In summary, we created and validated an LSTM neural network model with the CDC WONDER dataset, with promising results. Future development will aim to include clinical characteristics to create a more personalized predictive tool.
Figure: Figure 1: Comparison of Actual and Predicted Crude Rates for Training and Testing Data
Disclosures:
Sri Harsha Boppana indicated no relevant financial relationships.
Manaswitha Thota indicated no relevant financial relationships.
Sidharth Mahajan indicated no relevant financial relationships.
Sachin Sravan Kumar Komati indicated no relevant financial relationships.
C. David Mintz indicated no relevant financial relationships.
Sri Harsha Boppana, MBBS¹, Manaswitha Thota, MD², Sidharth Mahajan, MBBS³, Sachin Sravan Kumar Komati, MSc⁴, C. David Mintz, MD, PhD⁵. P1925 - Advancing Lower GI Cancer Projections: Mortality Trends Analysis and LSTM Model Predictions, ACG 2024 Annual Scientific Meeting Abstracts. Philadelphia, PA: American College of Gastroenterology.