Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics

Kavli Affiliate: Zheng Zhu

| First 5 Authors: Shuqi Yang, Mingrui Jing, Shuai Wang, Jiaxin Kou, Manfei Shi

| Summary:

This study reviewed the use of Large Language Models (LLMs) in healthcare,
focusing on their training corpora, customization techniques, and evaluation
metrics. A systematic search of studies published from 2021 to 2024 identified
61 articles. Four types of corpora were used: clinical resources, literature,
open-source datasets, and web-crawled data. Common customization techniques
included pre-training, prompt engineering, and retrieval-augmented generation,
with 44 studies combining multiple methods. Evaluation metrics were categorized
into process, usability, and outcome metrics, with outcome metrics further
divided into model-based and expert-assessed outcomes. The study identified
critical gaps in corpus fairness, which contribute to biases arising from
geographic, cultural, and socio-economic factors. The reliance on unverified or
unstructured data highlights the need for better integration of evidence-based
clinical guidelines. Future research should focus on developing a tiered corpus
architecture with vetted sources and dynamic weighting, while ensuring model
transparency. In addition, the lack of standardized evaluation frameworks for
domain-specific models calls for comprehensive validation of LLMs in real-world
healthcare settings.
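
As a concrete illustration of one customization technique named above, below is
a minimal Python sketch of retrieval-augmented generation: retrieved guideline
text is prepended to the prompt so the model's answer is grounded in evidence.
The toy corpus, the word-overlap retriever, and all function names are
hypothetical stand-ins, not taken from the reviewed studies; a real system
would use a proper retriever (e.g. BM25 or dense embeddings) over vetted
clinical sources.

```python
# Minimal RAG sketch for a clinical question-answering setting.
# Everything here (corpus, retriever, names) is illustrative only.

# Toy "corpus": snippets standing in for vetted clinical guideline text.
GUIDELINE_SNIPPETS = [
    "Adults with hypertension should have blood pressure rechecked within 1 month.",
    "Metformin is a first-line pharmacologic therapy for type 2 diabetes.",
    "Annual influenza vaccination is recommended for adults aged 65 and older.",
]

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank snippets by simple word overlap with the query (a stand-in
    for a real retriever such as BM25 or a dense embedding index)."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, snippets: list[str]) -> str:
    """Ground the model's answer by prepending retrieved evidence."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the guideline excerpts below.\n"
        f"Guideline excerpts:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    question = "What is the first-line drug therapy for type 2 diabetes?"
    prompt = build_prompt(question, retrieve(question, GUIDELINE_SNIPPETS))
    print(prompt)  # This prompt would then be sent to an LLM of choice.
```

The same pattern extends naturally to the tiered corpus architecture the study
recommends: snippets from more authoritative tiers could be retrieved with
higher weight.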

| Search Query: ArXiv Query: search_query=au:"Zheng Zhu"&id_list=&start=0&max_results=3
