A recent study highlighted the potential of natural language processing and large language models in extracting social determinants of health from electronic health records.
A study from researchers at Mass General Brigham examined the potential of large language models to extract social determinants of health from EHRs and improve real-world evidence. The study appeared in npj Digital Medicine.
Despite their significance, social determinants of health are often underdocumented in structured EHR data, which hinders comprehensive research and clinical care. Because this information is commonly found in free-text clinic notes, incorporating it into research databases is difficult, creating a bottleneck in understanding the impact of social determinants on health outcomes, according to the researchers.
The study examined whether natural language processing could offer a solution by automating the extraction of social determinant information from clinical text. It explored optimal methods for using language models to extract six categories: employment, housing, transportation, parental status, relationship, and social support.
Researchers also addressed algorithmic bias, with findings indicating that fine-tuned models were less sensitive to demographic descriptors than ChatGPT-family models.
Researchers found that the models they developed were effective at identifying patients with adverse social determinants of health (SDoH), outperforming structured diagnostic codes. With the potential to improve data collection and resource allocation, these models hold promise for assisting in patient care and contributing to a deeper understanding of health disparities driven by social factors, according to the researchers.