Agentic AI systems are all the rage these days, with many predicting they will transform the future. Let’s do a deep dive today to understand:
What is an agentic AI system?
How are agentic AI models being used now?
How do agentic AI models apply to healthcare?
What does Agentic AI mean?
Agentic AI is new, and the vocabulary around it is still evolving. I’m using Agentic AI because it’s the term Andrew Ng uses, and he seems like a credible source, but other sources refer to similar or narrower concepts by the terms:
AI with tool use
Multi-agent AI
AI with support agents
Multifunctional AI
Incidentally, it’s purported that the term Multi-Agent AI Model was first used a decade ago to describe
“methods available for constructing and testing computer models of social phenomena such as religious beliefs and behaviors.” Right now, there are many people excited about its potential to help with revenue management. I’m not sure what that says about societal progress.
Ethan Mollick did a fun demonstration of what AI agents could accomplish toward creating a business in less than a minute.
Andrew Ng’s overview of Agentic AI systems
If you’re unfamiliar with agentic AI systems, I highly recommend watching Andrew Ng’s recent video presentation at Sequoia Capital, which gives an overview of how he thinks about agentic AI systems.
In short, he groups the functions of agentic AI systems into four workflows:
Agentic AI workflows
1. Reasoning/reflection - an LLM agent has a critic agent that improves its outputs
2. Tool use - email, web use, computer vision
3. Planning - the system can lay out the best next steps toward a goal
4. Multi-agent collaboration - one LLM acts as a CEO for other LLMs
In his video, he notes that the technology for the first two workflows is more robust, while the last two are considered emerging, and that all of it can still be occasionally glitchy. This lack of reliability is especially important for agentic AI, since these systems combine the strengths, but also the weaknesses, of multiple models over a series of steps, meaning more opportunities for major errors.
Recent examples of agentic AI systems in the news
AI agents have actually been around for a while, but mostly in the arXiv literature and pre-prints. They’re now making their way into actual products with associated money and press, and therefore getting more attention.
Who is Devin?
I’m pretty sure the people choosing names for AI models are using the same inscrutable approaches favored by pharmaceutical companies. Cognition AI released Devin last week along with videos of Devin doing things that software engineers do, like building and deploying apps end to end, finding and fixing bugs, and training and fine-tuning an AI model. While Gemini, Copilot, and others can write code, Devin can make a plan, execute it, and test it. Although Cognition is branding it as “the first AI software engineer”, my guess is most software engineers will feel differently.
Why people are excited about AI systems playing video games
Let me be clear: I do not care about video games. I occasionally do very badly at MarioKart with my kids, but in general I’d rather read a book than play a video game. However, it turns out that video games are a great way to test the skills and limits of AI systems because they often involve a series of decisions and some kind of ‘learning’ or skill enhancement. Those general principles can then be translated into a higher-stakes environment like real life.
Google DeepMind is clearly investing in agentic AI systems, as their CEO notes, and they recently collaborated with the University of British Columbia to develop a “Scalable Instructable Multiworld Agent” (SIMA) that performed well on several video games. It could follow text instructions and take action in the video games, which…doesn’t sound very exciting. However, the authors describe the underlying technology as “building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks in a way that is helpful to people online and in the real world.” This likely includes applications in robots, which haven’t yet undergone the seismic shift that LLMs have over the past few years.
What do agentic AI systems have to do with medicine?
Hippocratic AI made news last week with the publication of an arXiv preprint about Polaris, which they describe as a “constellation model” (again, the terminology is new and evolving) and which sounds very similar to agentic AI to me. In Polaris, a primary agent model can use “support agents”, which are AI models that specialize in tasks like medication choice or identity verification. The primary agent is trained for patient conversations and bedside manner, and can access these other AI models to inform the conversations.
In the study, they had Polaris engage in conversations with patient actors and then had 130 physicians and 1100 nurses evaluate its responses compared to those of human nurses.
As they point out in their ArXiv preprint:
“The system is designed to achieve better domain-specific interactions compared to a single general purpose LLM. The healthcare conversation domain is apt for showcasing the value of this paradigm as there are many competing objectives and requirements, including a special emphasis on safety and verification”
This safety aspect makes the Hippocratic AI work especially notable. While video games are a great technical training ground, healthcare is inherently a high-stakes domain. What Hippocratic AI showed is not just that Polaris works, but that it can work safely. The figure below from their paper is somewhat confusing, but just focus on the “Medical Safety” aspect on the far left side. This graphic shows that Polaris performed as well as human nurses on the safety aspect when rated by doctors and nurses.
Conclusion
AI has had several hype cycles even in the past year, but I do believe that agentic AI is the direction AI systems will go in the very near future. Allowing AI models to access other tools enables so much more functionality that I think we’ll quickly forget what life was like before agentic AI was the standard. There’s a great deal more research to be done before it’s incorporated into healthcare in a clinical setting, but I’m hopeful it will allow for a range of improvements, especially in areas like robotics, which could assist with some of the mundane manual tasks inherent in medical and nursing care.
Where do you see agents being of most use in the future? What are you concerned about as agentic AI becomes the norm?