P2237 - Accuracy and Helpfulness of Chat GPT Compared to Traditional Search Engines for Gastroesophageal Reflux Disease (GERD) Queries: A New Era of Information Retrieval
Staten Island University Hospital, Northwell Health, Staten Island, NY
Chloe Lahoud, MD1, Mark Tawfik, DO1, Gaetano Di Pietro, MD1, Suzanne El-Sayegh, MD2, Sherif Andrawes, MD1 1Staten Island University Hospital, Northwell Health, Staten Island, NY; 2Staten Island University Hospital, Staten Island, NY
Introduction: The use of artificial intelligence (AI) is growing rapidly across medical specialties, and AI systems may offer advantages over traditional search engines. This study evaluates the performance of the AI system ChatGPT on queries related to gastroesophageal reflux disease (GERD) compared with traditional search engine responses.
Methods: We compared the AI system ChatGPT 3.5 with the traditional search engine Google. Six standardized queries on GERD topics (definition, risk factors, diagnosis, symptoms, and management) were generated using Google’s auto-completions. Responses from ChatGPT 3.5 and Google were evaluated for accuracy and helpfulness using a five-point Likert scale by three independent gastroenterology experts. Statistical analysis was performed using Python.
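The abstract does not state which statistical test was run in Python, and the individual ratings are not reported. As a minimal sketch only, the snippet below shows how one evaluator's six paired Likert ratings could be summarized and compared with an exact paired sign-flip permutation test. The test choice, the function name `paired_permutation_p`, and the rating values are all assumptions made for illustration; they are not the authors' method or data.

```python
from itertools import product
from statistics import mean


def paired_permutation_p(a, b):
    """Two-sided exact sign-flip permutation test on paired differences.

    Hypothetical helper for illustration; not the test named in the abstract.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Enumerate every way of flipping the sign of each paired difference.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total


# Invented ratings from one evaluator for six queries (NOT study data).
chatgpt = [5, 5, 5, 4, 5, 5]
google = [4, 3, 4, 3, 3, 3]

print(round(mean(chatgpt), 2), round(mean(google), 2))
print(paired_permutation_p(chatgpt, google))
```

With only six pairs, this exact test cannot produce a two-sided p-value below 2/64 (about 0.031), so it would not reproduce the smaller p-values reported in the Results; a parametric test such as a paired t-test is one possible alternative.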
Results: For accuracy, ChatGPT received mean ratings of 5, 4.67, and 4.67 from Evaluators 1, 2, and 3, respectively, compared with 4.83, 4.33, and 3.33 for Google (p = 0.363, 0.105, and 0.0007). For helpfulness, ChatGPT was rated 5, 4.67, and 4.83, compared with 4.33, 4, and 3.33 for Google (p = 0.012, 0.025, and 0.0037). Evaluator 3 rated ChatGPT significantly more accurate, whereas Evaluators 1 and 2 found no significant difference. All three evaluators rated ChatGPT significantly more helpful. Overall, ChatGPT received higher mean ratings for both accuracy and helpfulness.
Discussion: This study suggests that AI systems like ChatGPT 3.5 can provide more accurate and helpful information on GERD than traditional search engines like Google. AI can enhance patient education and decision-making, though further research is needed to confirm its reliability across other topics.
Disclosures:
Chloe Lahoud indicated no relevant financial relationships.
Mark Tawfik indicated no relevant financial relationships.
Gaetano Di Pietro indicated no relevant financial relationships.
Suzanne El-Sayegh indicated no relevant financial relationships.
Sherif Andrawes indicated no relevant financial relationships.
Chloe Lahoud, MD1, Mark Tawfik, DO1, Gaetano Di Pietro, MD1, Suzanne El-Sayegh, MD2, Sherif Andrawes, MD1. P2237 - Accuracy and Helpfulness of Chat GPT Compared to Traditional Search Engines for Gastroesophageal Reflux Disease (GERD) Queries: A New Era of Information Retrieval, ACG 2024 Annual Scientific Meeting Abstracts. Philadelphia, PA: American College of Gastroenterology.