A paper published last year by the Health AI Partnership (HAIP) reported on interviews with 89 healthcare professionals about trends in healthcare AI and found that:
“The participants pointed out that business administrators who do not have sufficient knowledge in patient care were typically the ones who evaluated AI algorithms.”
With that in mind, I’ve created a wishlist of what information to give clinicians about healthcare AI software.
Provide metrics that physicians understand
All physicians have had basic statistics and have “grown up” learning about sensitivity, specificity, NNT (number needed to treat), and PPV (positive predictive value). As the HAIP paper describes:
“Sensitivity, specificity, area under the receiver operating characteristic curve, positive predictive value number, and false positive and negative rate were commonly referenced by interviewees as model performance metrics.”
I’m a little surprised the respondents listed AUC, which is a common performance metric in AI but one with which most physicians would be unfamiliar. This likely just reflects that the sample pool of interviewees has more experience with AI than the average clinician.
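For anyone who wants to see the arithmetic, here’s a minimal sketch in Python of how the metrics named above fall out of a simple confusion matrix. The counts are made up purely for illustration:

```python
# Hypothetical confusion-matrix counts (invented for illustration).
tp, fp, fn, tn = 80, 10, 20, 890  # true/false positives, false/true negatives

sensitivity = tp / (tp + fn)  # fraction of true cases the tool catches
specificity = tn / (tn + fp)  # fraction of negative cases it correctly clears
ppv = tp / (tp + fp)          # when the tool flags a patient, how often it's right
fpr = fp / (fp + tn)          # false positive rate
fnr = fn / (fn + tp)          # false negative rate

print(f"Sensitivity: {sensitivity:.2f}")  # 0.80
print(f"Specificity: {specificity:.2f}")  # 0.99
print(f"PPV:         {ppv:.2f}")          # 0.89
```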
Similarly, physicians want to avoid using tools that are biased, or using them in patient groups where they would give inaccurate results. Give us that information in clear language (i.e., without leaning on AUC). The model cards that have been proposed are a great start, and making them even more user-friendly and clear should be a priority for healthcare AI developers. Being able to use the tools in the correct patients, in the correct settings, in the correct way benefits not only patients but ultimately the AI developers too; creating a product that works the way it’s supposed to is just good business.
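To make that concrete, here’s a hypothetical, stripped-down model card expressed as plain data. The fields loosely echo the spirit of the model card proposals, but every name and value below is my own invention, not any published standard:

```python
# A hypothetical, stripped-down model card. All fields and values are
# invented for illustration; they are not from any real product or standard.
model_card = {
    "intended_use": "Flag adult inpatients at elevated risk of sepsis",
    "not_for_use_in": ["pediatric patients", "outpatient settings"],
    "training_population": "Single academic center, 2018-2022",
    "performance": {
        "overall": {"sensitivity": 0.80, "specificity": 0.99, "ppv": 0.89},
        # Reporting subgroup performance surfaces potential bias directly:
        "by_subgroup": {
            "age >= 65": {"sensitivity": 0.74},
            "age < 65": {"sensitivity": 0.84},
        },
    },
    "known_limitations": "Not validated in immunosuppressed patients",
}
```

Even a plain-language table with these fields would let a clinician decide at a glance whether a tool applies to the patient in front of them.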
Tell us how it compares to baseline
Benchmarks are common in model evaluation as one way to distinguish performance among models. That baseline might be the success rate of current practice or a published guideline. Many clinical AI tools use guidelines as the baseline for their algorithms; if that’s the case, say so clearly, since physicians will be more comfortable following a guideline than the black box of AI. If it’s not using a guideline, tell us exactly where it departs from accepted specialty guidelines. We are going to be held legally liable for what we do based on the AI’s recommendation, so don’t make us pull up papers to figure out if, and then justify why, we’re doing something that is contrary to accepted clinical practice.
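One way to make that disclosure concrete: show, case by case, where the model and the guideline disagree. Here’s a minimal sketch; both decision rules and all of the patient data are hypothetical stand-ins:

```python
# Minimal sketch: tally where a model's recommendation diverges from a
# guideline-based rule on the same cases. Everything here is hypothetical.
def guideline_rule(patient: dict) -> bool:
    """Toy stand-in for an accepted specialty guideline."""
    return patient["risk_score"] >= 7

def model_prediction(patient: dict) -> bool:
    """Toy stand-in for the AI tool's recommendation."""
    return patient["risk_score"] >= 5 or patient["age"] >= 80

patients = [
    {"id": 1, "risk_score": 6, "age": 45},
    {"id": 2, "risk_score": 8, "age": 70},
    {"id": 3, "risk_score": 3, "age": 82},
]

for p in patients:
    g, m = guideline_rule(p), model_prediction(p)
    if g != m:
        # These are exactly the cases clinicians need flagged and explained.
        print(f"Patient {p['id']}: guideline says {g}, model says {m}")
```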
Tell us how well it works in silico vs the real world
There are lots of tools that work great in theory but perform noticeably worse in practice. Initial studies of the effect of AI on radiology diagnostic accuracy show that how the AI is used matters just as much as, if not more than, its ability to provide correct answers. The field of human-AI interaction is likely to mature in the coming years, hopefully producing data that better informs when and where to use AI.
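Even a simple side-by-side disclosure would help here. A sketch, with invented numbers, of the kind of reporting I mean:

```python
# Sketch: report the same metric on the retrospective (in silico) test set
# and on prospectively collected deployment data. All numbers are invented.
def sensitivity(tp: int, fn: int) -> float:
    return tp / (tp + fn)

in_silico = sensitivity(tp=90, fn=10)    # 0.90 on the curated test set
real_world = sensitivity(tp=70, fn=30)   # 0.70 once workflow effects kick in

print(f"In silico:        {in_silico:.2f}")
print(f"Real world:       {real_world:.2f}")
print(f"Performance drop: {in_silico - real_world:.2f}")
```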
Make it intuitive enough that I don’t need to do another online learning session
Raise your hand if you’ve had the (somewhat depressing) realization that your toddlers have very easily figured out how to use your iPhone with their chubby, dirty little fingers.
In contrast, I’m guessing almost no one has used hospital software that was so beautifully designed and tested that their toddler could navigate through the screens seamlessly. If you’re reading this and are not a physician, make a mental note never to ask your doctor friends about EHR training unless you want to hear a long diatribe about hours of their lives wasted while inbox messages piled up.
The cognitive demands and interruptions that physicians deal with are high, and adding another complex step to the process is likely to impede patient care. Creating tools that can be used quickly, without a high ‘start-up cost’ every time we use them, will improve care all around.
Tell me how you’re going to use my data
Are you going to package my prescription patterns into a dataset that you resell to insurers? Most ethical frameworks recommend that patients be told if their data are being used to train AI systems, but the rules for clinicians are not nearly as clear. Please be upfront about the ways you plan to use the data you gather.
Note: I’ll be traveling with my husband and four school-age kids for several months and will pause the articles while we’re away so I can focus on the kids (while they still like hanging out with their parents!)
In the meantime, I’d love to hear ideas about what topics you want to see covered in the future. Email machinelearningformds@gmail.com and let me know!