Productivity Monitoring in Software Developers
AI assessments for engineers and the lessons they hold for physicians
Measuring Productivity in Software Developers
Last week we looked at some examples of AI productivity monitoring including those at retail giants like Walmart and Amazon. In case you think this focus on quantifying productivity and performance using AI only applies to entry-level employees, McKinsey recently published an article titled “Yes, you can measure developer productivity” describing the new software they’ve implemented at 20+ companies to measure software developer productivity. They claim the performance tracking software has led to:
“20 to 30 percent reduction in customer-reported product defects
20 percent improvement in employee experience scores
60-percentage-point improvement in customer satisfaction ratings”
If you think similar software will never happen in medicine, I think you’re delusional.
The article describes the traditional difficulty of tracking developer metrics, many of which are analogous to physician metrics (bolded text is mine):
“Software development is also highly collaborative, complex, and creative work and requires different metrics for different levels (such as systems, teams, and individuals). What’s more, even if there is genuine commitment to track productivity properly, traditional metrics can require systems and software that are set up to allow more nuanced and comprehensive measurement.”
In other words, software development, like medicine, is complicated, requires a lot of people working together, and is hard to measure.
I’m not going to argue about whether this is a good or bad way to measure developer productivity; there are engineers who know a lot more about this and have described the downsides of the McKinsey report. Here I focus on some concepts and frameworks that are applicable to medicine.
Inner loop vs outer loop activities
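One interesting framework is their concept of inner loop vs outer loop activities. In the McKinsey article, the inner loop covers activities directly tied to creating the product (coding, building, unit testing), while the outer loop covers the other tasks needed to get code into production (integration, integration testing, releasing, deployment).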
A comparable model in medicine would have:
What I like about this framework is that it identifies what activities both improve productivity and provide more fulfillment to help developers do more of those activities. Imagine a health system that tried to maximize physicians’ inner loop activities and minimize their outer loop!
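To make the idea concrete, here is a minimal sketch of how a physician's inner loop share could be calculated from a simple time log. The activity names and the choice of which ones count as inner loop are my own assumptions for illustration, not categories from the McKinsey article or from any health system.

```python
from collections import defaultdict

# Hypothetical classification: which logged activities count as "inner loop".
# This mapping is an assumption for illustration only.
INNER_LOOP = {"patient_visit", "procedure", "clinical_reasoning"}


def loop_shares(time_log):
    """Return the fraction of logged minutes spent in each loop.

    time_log is a list of (activity_name, minutes) pairs.
    Any activity not listed in INNER_LOOP is treated as outer loop.
    """
    totals = defaultdict(float)
    for activity, minutes in time_log:
        loop = "inner" if activity in INNER_LOOP else "outer"
        totals[loop] += minutes
    total = sum(totals.values())
    if total == 0:
        return {"inner": 0.0, "outer": 0.0}
    return {loop: totals[loop] / total for loop in ("inner", "outer")}


# A made-up clinic day, in minutes.
day = [
    ("patient_visit", 240),
    ("procedure", 60),
    ("documentation", 120),
    ("prior_authorization", 30),
    ("meeting", 30),
]
shares = loop_shares(day)
print(f"inner loop: {shares['inner']:.1%}, outer loop: {shares['outer']:.1%}")
```

The hard part in practice is not the arithmetic but agreeing on the classification, which is exactly the conversation physicians should want to be part of.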
Using industry standard metrics
The report combines two standard software developer industry metrics, DORA and SPACE. DORA was developed by the DevOps team at Google, and SPACE by researchers at GitHub, Microsoft, and others.
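For readers who haven't seen them, the four DORA metrics are deployment frequency, lead time for changes, change failure rate, and time to restore service. Below is a rough sketch of how they could be computed from a list of deployment records. The `Deployment` record and its field names are hypothetical, and I use simple means where real tooling often reports medians or performance bands.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment record; field names are assumptions."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def _mean_hours(deltas):
    """Average a non-empty list of timedeltas, expressed in hours."""
    return sum(deltas, timedelta()).total_seconds() / 3600 / len(deltas)


def dora_metrics(deployments, period_days):
    """Compute simplified versions of the four DORA metrics."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restores = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "mean_lead_time_hours": _mean_hours(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore_hours": _mean_hours(restores) if restores else None,
    }


# Made-up week of four deployments, one of which failed and was fixed in 2 hours.
t0 = datetime(2024, 1, 1, 9)
week = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=8)),
    Deployment(
        t0 + timedelta(days=2),
        t0 + timedelta(days=2, hours=12),
        failed=True,
        restored_at=t0 + timedelta(days=2, hours=14),
    ),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=8)),
]
metrics = dora_metrics(week, period_days=7)
print(metrics)
```

Notice that all four are measured at the team or system level rather than per individual, which is part of why they are hard to game.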
McKinsey’s productivity framework
McKinsey uses these standard metrics and adds opportunity-focused metrics with catchy names like “contribution analysis” and “talent capability score”. Contribution analysis looks at how much each engineer contributes to the team’s workload; this is easier in project-based areas, harder in healthcare. Talent capability score measures how well an individual’s or team’s skills line up with what the organization needs.
Defining what practices and healthcare systems actually need from physicians would require serious thought from practice and system leadership. Do they want strong non-clinical leadership? Clinical reasoning ability? Clinical team management? Academic publishing frequency? I am willing to bet that the vast majority of health systems haven’t identified what skills and knowledge they need in their physicians other than “see patients”, and therefore they can’t train, upskill, or cultivate those skills appropriately.
I just want to pause to say that if minimizing physician interruptions were made a focus, it seems very likely that medical errors would decrease. For obvious emergency reasons, the system is actually designed to maximize physician interruptions via texts, Secure Chat, and pagers.
The concept I like from this chart is the explicit focus on measuring productivity at a personal, team, and system level. This makes sense in medicine because so much of physician productivity and performance is tied to the team and system.
From the team perspective:
The OR scheduler
The office manager
Whether a physician has an advanced practice provider on the team
From the system perspective:
Marketing for services
Administrative and documentation burden
Non-clinical and meeting-related burden
Our current system views physician productivity and performance in isolation; having a mechanism to evaluate how these other factors affect them could be incredibly helpful.
Misuse of metrics
The report cautions against misuse of metrics, including:
Overly simplified metrics. In software development this might be using a metric like lines of code written, which can lead to gaming the system by writing code that’s longer than it needs to be. I’d argue that this describes the current state of medicine. From RVUs to quality metrics, we’ve often been limited by measuring what we can instead of what we should.
Not trying to measure anything because it seems too complicated. I think this is more of a risk for physicians than for hospital systems. If we as a profession don’t come up with ways to help hospitals figure out how to measure productivity and performance other than RVUs, I’m confident that health systems will find ways to do that for us.
Summary
Any measurement system, including this one by McKinsey, is going to have pros and cons. Developing them thoughtfully and keeping the ultimate goal in mind goes a long way. In reading the literature about software developer productivity, I’ve been impressed by:
How closely tied performance and productivity are assumed to be
I.e., a happy engineer will do better work more quickly
This concept mirrors findings in the medical literature that part-time physicians are more productive and that burnout makes you less productive
The focus on engineer working environment and high-yield use of engineer time.
Next week, we’ll look at how AI can be used to assess physician clinical performance including in medical education. The fourth week will focus on the future and best/worst case scenarios for AI in physician monitoring and evaluation.
If you’re a physician, join us at the ML for MDs Slack group, where we share resources and knowledge about the intersection of AI in healthcare.