Pragmatic Evaluation of Large Language Models in Healthcare

Large Language Models (LLMs) and LLM-based tools are becoming ubiquitous in healthcare. However, there is a dearth of analyses evaluating the efficacy of these tools. This tutorial has three learning objectives. Participants will:

gain a rigorous conceptual understanding of how LLMs are built;
become familiar with the various frameworks available in the literature for LLM evaluation; and
observe three real-world case studies that demonstrate the use of these frameworks (and other open-source tools) to evaluate LLM deployments of increasing complexity.

Website under construction

Full tutorial materials will be available by July 6, 2026.