Healthcare AI Governance: Yes! Details: TBD
An abundance of enthusiasm fuels an overabundance of questions
I got into college because I was willing to type more than everyone else.
That’s not the only reason, of course - I was (and am) earnestly nerdy in a way colleges liked - but the fact that I applied to seventeen colleges, each with its own separate application, definitely gave me an edge. Back then, there was no Common App. Just a typewriter, White-Out, and the sheer will to keep answering the same questions over and over again.
It wasn’t a particularly elegant strategy. Just stubborn. But it worked.
Which is exactly the kind of inefficiency we should be eliminating from life in 2025. Unfortunately, this exact process is being replicated in healthcare AI governance.
Today, if you’re an AI company trying to sell into health systems or insurers, you’re essentially staring down seventeen different typewriters.
Many frameworks, little actionable guidance
A recent systematic review of journal articles related to governance identified 22 healthcare AI governance frameworks and found that “most articles briefly addressed the importance of clear objectives for AI implementation in healthcare, with only a few offering actionable guidance.”
They also found that these frameworks generally didn’t have any evidence for their use, and that most of the frameworks “tended to be geared towards large academic health systems with substantial resources and personnel…There are few actionable pathways tailored to healthcare organizations with varying levels of resources and expertise.”
The authors suggest that healthcare systems “document and evaluate current standards of care” to “contextualiz[e] the status quo”, to understand both what AI tools they actually need and how new tools would fit into the workflow. Although the authors don’t explicitly mention this, health systems need to understand how well their current processes work to understand whether an AI tool will be a relative improvement. Unfortunately, this kind of information is rarely well documented or understood.
The study noted that most of the questions focused on initial risks like bias and equity, transparency, explainability, and clinical validation. Few addressed ongoing monitoring and validation or patient interactions. Surprisingly, “[e]xternal product evaluation, selection, and model evaluation and validation were among the least discussed aspects in the reviewed literature, often with overlapping requirements.”
This lack of clarity has led to an abundance of questions that have to be answered in parallel, with no standard form, no clear expectations, and a governance landscape that shifts from system to system.
Vendors are basically being asked to type the same thing over and over. On seventeen different typewriters.
The authors compiled this nice table, which has great ideas but, as they mention, lacks operational specifics.
The authors also include this nice table to give health systems some structure about how mature their healthcare AI governance systems are. I’ve circled where I think most systems are based on my completely unscientific method of talking to a lot of people. I also have no idea what the “Target Organization” row means, and as a rural doc feel vaguely offended that I can’t make it past Level 2 (though this might have more to do with the overachieving part of me that filled out all those college applications).
Governance: Yes! Details: Someone should figure that out
Let's be clear: we don't lack ideas about what guardrails should be in place.
The White House’s OMB Memo M-25-21 from April 3, 2025 requires governance and validation for “high-impact AI” used across federal agencies. That includes any AI that affects healthcare access or outcomes.
Insurance commissioners have issued some of the first real compliance requirements for healthcare AI, mostly related to bias and discrimination.
Academic and industry groups, like the Coalition for Health AI (CHAI) and the Health AI Partnership, are publishing detailed guidance on what model cards and validation should include.
Laws from California, Colorado, and others are evolving and require things like alerting people (including patients) when they’re interacting with an AI system.
So there’s broad consensus: governance is needed. But no one can quite agree on how to do it.
What we need: A Common App for AI
It’s time to build a common application for healthcare AI governance—just like the one that changed college admissions. A core application that captures shared requirements across risk domains: bias, validation, transparency, workflow impact. A modular set of supplemental sections, triggered only if the AI poses certain kinds of risk. Just like a college essay on "Why Emory?"—but for algorithmic fairness or language translation fidelity.
For example, a tool used by dermatologists in high-income settings, supported by specialists and validated across demographics, might require very little customization. A tool that evaluates mental health from free-text input in five languages and triages patients in emergency settings needs a completely different set of questions to target distinct risks.
Each risk domain can be scored from 1 (low) to 5 (high). If a system scores a 4 or 5 in any domain, that should trigger a supplemental section in the governance application. That way, vendors only answer the questions that are actually relevant to their tool and use case.
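To make the triggering logic concrete, here’s a minimal sketch in Python. The domain names, the 4-or-higher threshold, and the supplement labels are my own illustrative assumptions, not part of any published framework.

```python
# Minimal sketch: a common core application plus risk-triggered supplements.
# Domain names, threshold, and supplement labels are illustrative assumptions.

SUPPLEMENTS = {
    "bias": "Algorithmic fairness supplement",
    "validation": "Clinical validation supplement",
    "transparency": "Transparency and explainability supplement",
    "workflow_impact": "Workflow and safety supplement",
}

TRIGGER_THRESHOLD = 4  # scores of 4 or 5 trigger the extra questions


def required_supplements(risk_scores: dict[str, int]) -> list[str]:
    """Return the supplemental sections a vendor must complete,
    given 1 (low) to 5 (high) scores for each risk domain."""
    return [
        SUPPLEMENTS[domain]
        for domain, score in risk_scores.items()
        if score >= TRIGGER_THRESHOLD
    ]


# Example: the dermatology tool vs. the multilingual mental-health triage tool
derm_tool = {"bias": 2, "validation": 3, "transparency": 2, "workflow_impact": 2}
triage_tool = {"bias": 5, "validation": 4, "transparency": 4, "workflow_impact": 5}

print(required_supplements(derm_tool))    # [] -> core application only
print(required_supplements(triage_tool))  # all four supplements
```

Everything below the threshold stays in the core application; only the high-risk domains generate extra paperwork.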
This would make governance simpler, more consistent, and fairer to vendors, including those who haven’t gone through FDA clearance but are still required to provide evidence for their models. These supplements could be built into an IRB-style approval process, which similarly requires higher levels of detail for specific risk types.
Right now, we’re still in the “de-risking” phase.
Let’s get back to the point of governance: to improve patient care, safety, efficiency, and clinician and patient experience.
Most of the current AI governance work being done today is focused on ensuring the AI doesn’t cause harm.
But lack of harm doesn’t mean it works. And knowing it works in silico doesn’t mean it works in the real world, and knowing it works in the real world doesn’t mean it is better than the current state. That last bar, improving on the current state, is where we should set our focus, especially as models change and interact, users evolve, and clinical practice changes.
We’re asking the right questions. But only up front, sometimes too many, and sometimes of the wrong tools. And then…someone has to read all of them! And make sense of them. Even though many of the answers don’t apply and therefore are likely to be meaningless.
If we could agree on a common set of core questions—and build smart, risk-based supplements—we could make governance easier for vendors, more meaningful for clinicians, and potentially more transparent and understandable for patients.
We shouldn't be implementing AI tools based on which vendors are willing to type the same answers on seventeen different typewriters. Just like the Common App revolutionized college admissions, standardizing healthcare AI governance through a common framework would make the process more efficient, fair, and ultimately better for patients.
The college application process evolved beyond typewriters and White-Out. Healthcare AI governance should too.