Retrieval-augmented generation for personalized physician recommendations in online medical services: model development study

Yingbin Zheng; Yiwei Yan; Sai Chen; Yunping Cai; Kun Ren; Yishan Liu; Jiaying Zhuang; Min Zhao

doi:10.71070/oaml.v5i1.141

Vol. 5 No. 1 (2025): Issue 5

Yingbin Zheng
Biomedical Big Data Center, The First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, China

Yiwei Yan
Biomedical Big Data Center, The First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, China

Sai Chen
Meteorological Disaster Prevention Technology Center, Xiamen Meteorological Bureau, Xiamen, China

Yunping Cai
Meteorological Disaster Prevention Technology Center, Xiamen Meteorological Bureau, Xiamen, China

Kun Ren
Meteorological Disaster Prevention Technology Center, Xiamen Meteorological Bureau, Xiamen, China

Yishan Liu
School of Software Engineering, Taiyuan University of Technology, Taiyuan, China

Jiaying Zhuang
School of Software Engineering, Taiyuan University of Technology, Taiyuan, China

Min Zhao
School of Software Engineering, Taiyuan University of Technology, Taiyuan, China

Published 2025-03-05

Keywords

large language models,
Mistral,
SBERT,
triage systems,
retrieval-augmented generation-based physician recommendation,
RAGPR model

How to Cite

Zheng, Y., Yan, Y., Chen, S., Cai, Y., Ren, K., Liu, Y., … Zhao, M. (2025). Retrieval-augmented generation for personalized physician recommendations in online medical services: model development study. Optimizations in Applied Machine Learning, 5(1). https://doi.org/10.71070/oaml.v5i1.141

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Abstract

Web-based medical services have expanded access to healthcare through remote consultations and streamlined scheduling, but personalized physician recommendations remain limited because they rely on manual triage. This study developed and validated a Retrieval-Augmented Generation-Based Physician Recommendation (RAGPR) model to improve triage performance. Using 646,383 consultation records from the Internet Hospital of the First Affiliated Hospital of Xiamen University, we evaluated three embedding models (FastText, SBERT, OpenAI) on clustering and classification, and three large language models (LLMs; Mistral, GPT-4o-mini, GPT-4o). Three triage staff also assessed model efficiency via questionnaires. Among the embedding models, FastText performed poorly (F1-score 46%), while SBERT and OpenAI achieved 95% and 96%, respectively. Among the LLMs, GPT-4o reached the highest F1-score (95%) with a performance rating of 4.67, followed by Mistral (94%, 4.56) and GPT-4o-mini (92%, 4.45). Considering accuracy, cost, and implementation, the combination of SBERT and Mistral was optimal. The RAGPR model offers a scalable approach to improving accuracy and personalization in online patient–physician matching.
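
The retrieve-then-generate pattern summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the records are hypothetical toy data, the `embed` function is a bag-of-words stand-in for a sentence embedding model such as SBERT, and the prompt produced by `build_prompt` would be passed to an LLM (for example Mistral) in a real pipeline. All function names here are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy records (past consultation text, department); not the study's data.
RECORDS = [
    ("chest pain and shortness of breath when climbing stairs", "Cardiology"),
    ("itchy red rash on both arms for a week", "Dermatology"),
    ("persistent cough with fever and sore throat", "Respiratory Medicine"),
    ("palpitations and chest tightness at night", "Cardiology"),
]

def embed(text):
    """Stand-in embedding: word counts. A real system would use a sentence encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, records, k=2):
    """Return the k records whose text is most similar to the query."""
    q = embed(query)
    return sorted(records, key=lambda r: cosine(q, embed(r[0])), reverse=True)[:k]

def build_prompt(query, retrieved):
    """Assemble retrieved cases and the new complaint into a prompt for an LLM."""
    context = "\n".join(f"- {text} -> {dept}" for text, dept in retrieved)
    return (
        "Similar past consultations:\n"
        f"{context}\n"
        f"New patient complaint: {query}\n"
        "Recommend the most suitable department or physician."
    )

query = "chest pain at night with palpitations"
top = retrieve(query, RECORDS, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

Swapping `embed` for a real sentence encoder changes only the vector representation; the retrieval and prompt-building steps keep the same shape.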
