
Epic leads new effort to democratize health AI validation

Epic this past week announced the availability of new software that could help hospitals and health systems assess and validate artificial intelligence models.

Aimed at healthcare organizations that might otherwise lack resources to properly validate their AI and machine learning models, the tool – which is open source and freely available on GitHub – is designed to help providers make decisions based on their own local data and workflows.

Epic is working with the Health AI Partnership and data scientists at Duke University, the University of Wisconsin and other organizations to test the “seismometer” and develop a shared, standardized language.

The suite of tools can help organizations validate that their AI models improve patient care, advance health equity and avoid bias, according to Corey Miller, Epic’s vice president of research and development.

We spoke recently with Miller – along with Mark Sendak, population health and data science lead at Duke Institute for Health Innovation and a leader of the Health AI Partnership (HAIP), and Brian Patterson, UW Health’s medical informatics director for predictive analytics and AI – to learn more about the software and how healthcare organizations can use it.

The three described how the open-source tool can support provider workflows and clinical use cases, outlined plans for analyzing usage, contributions and enhancements, and explained how open-source credibility lends itself to scaling the use of AI in healthcare.

A ‘funnel’ that uses local data

One major potential benefit of the validation tool, said Miller, is the ability to use it to drill into data and find out why a “protected class isn’t getting as great outcomes as other people” and learn which interventions may improve patient outcomes. 

The seismometer – Epic’s first open-source tool – is designed so any healthcare organization can use it to evaluate any AI model, including homegrown models, against local population data, Miller said. The suite applies standardized evaluation criteria to any data source – any electronic health record (EHR) or risk management system.

“The data schema and funnel just take in data from any source,” he explained. “By standardizing the way you pull the data out of the system, it gets ingested and put into this notebook, which is effectively the data you can run code against.”
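To make that idea concrete, here is a minimal, hypothetical sketch in Python of what such a standardized prediction-level schema could look like; the column names and CSV format are illustrative assumptions, not seismometer’s actual interface:

```python
import pandas as pd

# Hypothetical standardized schema: one row per model prediction.
# These column names are illustrative assumptions, not seismometer's schema.
STANDARD_COLUMNS = [
    "patient_id",       # local identifier; data never leaves the organization
    "prediction_time",  # when the model scored the patient
    "score",            # model output, e.g. a risk probability
    "outcome",          # observed label: 1 = event occurred, 0 = it did not
    "race",             # demographic attributes used for fairness analysis
    "sex",
]

def load_predictions(path: str) -> pd.DataFrame:
    """Ingest a CSV extract from any EHR or risk system and normalize it
    to the standard schema so notebook code can run against it."""
    df = pd.read_csv(path, parse_dates=["prediction_time"])
    missing = set(STANDARD_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"extract is missing columns: {missing}")
    return df[STANDARD_COLUMNS]
```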

The resulting dashboards and visualizations are “gold standard tools” already used to evaluate AI models in healthcare settings. 

Epic does not get any user data, as the intention is to run validation locally, but the EHR vendor’s developers and quality assurance staff will review any code suggested for addition via GitHub. 

Open source to build trustworthy AI

While the tool relies on technology Epic has developed over many years, Miller said it took about two months to open-source it and build the additional components, data schema and notebook templates.

During that time, he said Epic worked with data scientists and clinicians at several healthcare organizations to test the suite on their own local predictions. 

The goal is to “help with a real-world problem,” he said.

One tool in the seismometer suite, called the Fairness Audit, is based on an audit toolkit developed by the University of Chicago and Carnegie Mellon to score a model’s fairness across different protected classes and demographic groups, Miller said.
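As a rough illustration of what such an audit computes, the hedged sketch below scores one error metric (false negative rate) per demographic group and reports each group’s disparity against a reference group. The metric choice, decision threshold and reference-group convention are assumptions for illustration, not the Fairness Audit’s actual implementation:

```python
import pandas as pd

def fairness_audit(df: pd.DataFrame, attribute: str,
                   threshold: float = 0.5) -> pd.DataFrame:
    """Score a binary risk model's false negative rate (FNR) per group,
    then compute each group's disparity vs. the largest (reference) group.
    Assumes the standardized columns sketched above ('score', 'outcome')."""
    df = df.assign(flagged=df["score"] >= threshold)
    rows = []
    for group, g in df.groupby(attribute):
        positives = g[g["outcome"] == 1]
        fnr = (~positives["flagged"]).mean() if len(positives) else float("nan")
        rows.append({"group": group, "n": len(g), "fnr": fnr})
    audit = pd.DataFrame(rows).set_index("group")
    reference = audit["n"].idxmax()  # largest group taken as reference
    audit["fnr_disparity"] = audit["fnr"] / audit.loc[reference, "fnr"]
    return audit
```

A disparity ratio near 1.0 suggests the model misses true positives at a similar rate across groups; large deviations are the kind of signal that prompts the drill-down Miller describes.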

“Most healthcare organizations today do not have the capabilities or personnel for local model testing and monitoring,” Sendak added.

In December at the ONC 2023 Annual Meeting, Sendak and Jenny Ma, a senior advisor in the Health and Human Services Office for Civil Rights, said – in a session focused on addressing racial bias in AI – that it became clear during the COVID-19 pandemic that healthcare resources were being allocated unfairly. 

“It was a very startling experience to see first-hand how poorly equipped not only Duke was but many health systems in the country to meet low-income, marginalized populations,” Sendak said.

While HAIP and many other health institutions have been validating AI, Sendak said this new AI validation tool offers a “standard set of analysis that now will be much more broadly accessible” to numerous other organizations.

“It’s an opportunity to really diffuse the best practice by giving folks the tooling,” he said.

The University of Wisconsin will be working with HAIP – a multi-stakeholder group of 10 healthcare organizations and four ecosystem partners that joined together for peer learning, collaboration and the creation of guidance for using AI in healthcare – and with the broader community of users to test the open-source tools and make “apples to apples” comparisons.

“Even though we do have a team of data scientists and we’re in one of these well-resourced places, having tools that make it easier benefits everyone,” said Patterson. 

Having the tools for standard processes “would make our lives easier,” but also the engaged community of users validating Epic’s open-source tool together “is one of the things that’s going to build trust among end users,” he added.

Comparing across organizations

Patterson said the University of Wisconsin team has not picked specific use cases to test with the seismometer, but the plan is to start with the simpler AI models they use.

“None of the models are super simple, but we have a range of models that we’re running from Epic and some of the ones that our research teams have developed,” he said.

Those that “run on fewer inputs, and specifically models that output a ‘yes, no,’ this condition exists or doesn’t, are good ones in which we can generate some early statistics.”
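For those yes/no models, the early statistics are standard measures of discrimination and error rates, which any site can compute from its own labeled extract. A minimal sketch with scikit-learn, using placeholder values:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def binary_model_stats(y_true, scores, threshold=0.5):
    """Early validation statistics for a yes/no model on local data."""
    preds = (np.asarray(scores) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, preds).ravel()
    return {
        "auroc": roc_auc_score(y_true, scores),  # discrimination
        "sensitivity": tp / (tp + fn),           # recall on true positives
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),                   # precision
    }

# Toy illustration with fabricated placeholder values:
print(binary_model_stats([0, 1, 1, 0, 1], [0.2, 0.8, 0.4, 0.1, 0.9]))
```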

Sendak said HAIP is considering a shortlist of models for its first evaluation study, which looks to improve the usability of the tools in community and rural settings that are part of its technical assistance program. 

“All of the models that we’re looking at involve some amount of localized retraining to the model parameters,” he explained.

“We’re going to be able to look at: What does the off-the-shelf model perform like at Duke and the University of Wisconsin? Then, after we conduct the localization, where we train on local data to update the model, we’ll be able to say, ‘OK, how does this localized version compare now across the sites?’”
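Sendak does not detail the retraining method; one common, lightweight form of localization is recalibration, for example Platt-style refitting of the vendor model’s score against local outcomes. A hedged sketch of that before-and-after comparison at a single site, assuming the off-the-shelf model exposes risk scores and that calibration (Brier score) is the measure being compared:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

def localize(fit_scores, fit_outcomes):
    """Platt-style recalibration: refit the mapping from the vendor
    model's risk scores to locally observed outcomes."""
    lr = LogisticRegression()
    lr.fit(np.asarray(fit_scores).reshape(-1, 1), fit_outcomes)
    return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]

def compare_at_site(fit_scores, fit_outcomes, test_scores, test_outcomes):
    """Calibration (Brier score, lower is better) of the off-the-shelf
    scores vs. the localized version, on held-out local data."""
    localized = localize(fit_scores, fit_outcomes)
    return {
        "off_the_shelf_brier": brier_score_loss(test_outcomes, test_scores),
        "localized_brier": brier_score_loss(test_outcomes,
                                            localized(test_scores)),
    }
```

Because rank-based metrics such as AUROC are unchanged by monotonic recalibration, calibration measures are the natural before-and-after comparison in this kind of localization.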

“I think these tools are going to be most effective in the end on models that are fairly complex,” Patterson added. “And the ability to do that with less data science resources at your disposal democratizes that process and hopefully expands that community quite a bit.”

AI validation for compliance

Sendak said the tools could help provider organizations ensure fairness and find out where they must improve, noting that they have 300 days to comply with new nondiscrimination rules.

“They have to do risk mitigation to prevent discrimination,” he said. “They’ll be held liable for discrimination that results from the use of algorithms.”

The Section 1557 nondiscrimination rule, finalized this past month by OCR, applies across healthcare operations, from screening and risk prediction to diagnosis, treatment planning and allocation of resources. The rule extends nondiscrimination protections to telehealth and to the use of certain AI tools, broadening the circumstances in which providers could be held liable for discrimination in healthcare.

HHS said there were more than 85,000 public comments on nondiscrimination in health programs and activities.

A new, free 12-month technical assistance program through HAIP will help five sites implement AI models, Sendak noted. 

“We know that the magnitude of the problem of 1,600 federally qualified health centers, 6,000 hospitals in the United States, it’s a huge scale at which we have to rapidly diffuse expertise,” he explained. 

The HAIP Practice Network will support organizations like FQHCs and others lacking data science capabilities. Applications are due June 30.

Those selected will adopt best practices, contribute to the development of AI best practices and help assess AI’s impact on healthcare delivery.

“That’s where we see a huge need for tools and resources to support local validation of AI models,” said Sendak. 

Andrea Fox is senior editor of Healthcare IT News.
Email: [email protected]

Healthcare IT News is a HIMSS Media publication.
