From Prompts to Context
The next phase of AI isn’t about clever wording. It’s about teaching systems to understand the world they’re in.
Remember a few years ago, when everyone realized better prompts could make AI give better answers, then gave the practice the complicated-sounding name of “prompt engineering”?
As models have grown more powerful and general, the frontier of confusingly technical wording has shifted again. What matters most now isn’t how we prompt models, but how we prepare their context. In other words, the next great challenge in AI isn’t getting the model to talk, but helping it understand what it’s talking about.
https://www.gartner.com/en/articles/context-engineering
This shift from prompt engineering to context engineering is reshaping how every major sector is building and evaluating AI systems. In finance, banks are creating dynamic “context vaults” that feed models with real-time market data and compliance constraints. In law, firms are constructing retrieval systems that pre-load models with relevant case law and client histories so the output reflects precedent, not just probability. Even the defense industry now tests AI systems in simulated battle environments, measuring not only accuracy but performance under stress, ambiguity, and imperfect information. The core insight is the same across all of them: the model’s output is only as good as the context it sees.
What is context engineering?
Context engineering is the deliberate design of the information environment around an AI system—everything that shapes how it interprets, retrieves, and applies knowledge. It involves curating relevant data, structuring inputs, defining environmental signals, and managing the feedback loops that connect AI outputs back to human users. In large language models, it often means building retrieval-augmented systems that pull verified information into the model’s context window before it answers a question. In other settings, it means designing the operational scaffolding: how an alert is triggered, how it’s displayed, who reviews it, and what happens next.
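To make the retrieval-augmented idea concrete, here is a minimal sketch of that pattern: score a small document store against the question, then assemble the best matches into the context ahead of the question. The function names and the keyword-overlap scoring are illustrative assumptions, not any particular product's implementation; real systems use embedding-based search.

```python
# Minimal sketch of retrieval-augmented context assembly.
# The scoring (word overlap) and names (retrieve, build_prompt) are
# illustrative only; production systems use vector embeddings.

def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    q_words = set(question.lower().split())
    return sorted(
        documents,
        key=lambda d: -len(q_words & set(d.lower().split())),
    )[:k]

def build_prompt(question, documents):
    """Place retrieved, verified text ahead of the question in the context window."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return (
        "Use only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

docs = [
    "Metformin is first-line therapy for type 2 diabetes.",
    "The formulary lists insulin glargine as the preferred basal insulin.",
    "Visiting hours end at 8 pm.",
]
print(build_prompt("What is first-line therapy for type 2 diabetes?", docs))
```

The point of the sketch is the ordering: the model never sees the raw question alone; it sees the question inside a curated information environment.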
https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
This idea matters because context determines not just what AI knows, but how it reasons. A radiology model can be 98% accurate in detecting lung nodules but still fail patients if it’s deployed without knowing the clinical question, prior imaging, or patient history. The same logic applies outside medicine: a self-driving car can interpret a stop sign correctly 99.9% of the time, yet if it misreads one during a snowstorm or next to a construction zone, the result can be catastrophic. The point isn’t that AI fails, but that it fails differently depending on context, and that’s what needs to be measured.
Context Engineering vs. System-Level Orchestration
The terms context engineering and system-level orchestration are closely related and sometimes used interchangeably, but they describe different layers of how modern AI systems work.
Context engineering, as Anthropic and others use the term, focuses on what happens inside the model’s head. It’s the process of structuring and managing the information that the model sees at the moment it’s asked to reason—what’s loaded into its “working memory.” That includes which documents are retrieved, how they’re summarized, how the question is phrased, and what examples or constraints are provided. The goal is to make sure the model’s internal context window contains the most relevant, accurate, and well-organized information possible for the task at hand. It’s like briefing a specialist before a complex case: you can’t change their training, but you can decide what information they have in front of them.
System-level orchestration operates one layer higher. It’s about how multiple agents, tools, and data sources coordinate around that model—how information is retrieved, routed, stored, and updated across the system. It governs when and how the model is called, how its outputs are validated, and how the results feed back into the workflow. In this sense, orchestration is less about the contents of the model’s mind and more about the design of its environment. It’s the control tower, not the cockpit.
In practice, the two blend together. Effective orchestration depends on good context engineering: the system must know which information to surface and when. And strong context engineering often relies on orchestration: retrieval pipelines, shared memory stores, and feedback loops that make sure the right context is available at the right time. That’s why many organizations now use the terms together. But in healthcare, the distinction matters. Context engineering governs what clinical information an AI tool uses to make a judgment; orchestration governs how that tool fits into the broader ecosystem of clinicians, data systems, and decision pathways. Both must work together to ensure the AI is not only smart, but clinically coherent.
The rise of context engineering across industries
Instead of focusing on the wording of a single prompt, developers now build pipelines that shape what information the model sees, how it reasons through multi-step problems, and how it integrates external tools or databases. Enterprises are hiring “context architects” to manage these flows, treating context like data infrastructure rather than user input.
Even Magic Schools have Context Engineers!
Google DeepMind applies similar principles in scientific research. When training models to predict protein structures or optimize materials, they integrate domain constraints, simulation feedback, and expert verification into the learning context itself. These are not peripheral details. They’re fundamental to making the models work in the real world.
Context as the Missing Variable in Healthcare AI
In medicine, context is everything. A lab value means one thing for a healthy twenty-year-old and something entirely different for an eighty-year-old in the ICU. A slightly elevated heart rate might signal anxiety in one patient and impending shock in another. Clinicians interpret every data point against a web of circumstances—what came before, what’s happening now, and what might happen next. Yet most AI models still treat data as if it exists in isolation.
Context engineering asks how an AI system perceives, interprets, and adapts to its environment. Can it recognize when the data are incomplete? Can it adjust when a hospital uses a different EHR interface or staffing model? Can it learn that what matters in a rural clinic is not the same as what matters in a tertiary trauma center? A well-engineered context layer allows AI to handle ambiguity and variation, also known as real life, without collapsing into error.
This shift also reflects a broader policy evolution. The OMB Memorandum M-25-21 (Accelerating AI Use Through Innovation), released earlier this year, calls for “context-specific risk management” for all high-impact AI systems, including those used in healthcare. The concept is that AI should be evaluated not only for what it can do in principle but for how it behaves in practice, within the social and operational environment it inhabits. That principle, though not the most elegantly phrased, captures what clinicians already know instinctively: context defines safety.
To operationalize that, healthcare needs evaluation frameworks that go beyond static test sets and into scenario-based testing. A sepsis alert, for example, should be validated under realistic conditions: delayed lab inputs, staffing shortages, noisy data, and competing alerts. A radiology model should be tested across different hospitals with varying equipment and image quality. Chat-based assistants should be judged not just on factual accuracy but on their ability to maintain continuity through handoffs, the same way a clinical team must during shift change.
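One way to picture scenario-based testing is to wrap a decision rule in a harness that degrades its inputs, then check whether the behavior stays sensible. Everything below is a hedged toy, assuming a hypothetical threshold-based sepsis screen (the thresholds and the dropout mechanism are illustrative, not a clinical rule):

```python
import random

def sepsis_alert(obs):
    """Hypothetical screen: fire when two or more simple criteria are met.
    Thresholds are illustrative, not clinical guidance."""
    criteria = [
        obs.get("heart_rate", 0) > 100,
        obs.get("temp_c", 37.0) > 38.3,
        obs.get("lactate") is not None and obs["lactate"] > 2.0,
    ]
    return sum(criteria) >= 2

def run_scenario(obs, dropout=0.0, seed=0):
    """Simulate a degraded environment by randomly dropping inputs,
    e.g. delayed labs or a vitals feed outage."""
    rng = random.Random(seed)
    degraded = {k: v for k, v in obs.items() if rng.random() >= dropout}
    return sepsis_alert(degraded)

patient = {"heart_rate": 118, "temp_c": 38.9, "lactate": 3.1}
print("clean inputs:", run_scenario(patient))
# Delayed lab scenario: lactate never arrives.
delayed = {k: v for k, v in patient.items() if k != "lactate"}
print("delayed labs:", sepsis_alert(delayed))
```

Static test sets only exercise the first case; the scenario harness asks what the alert does when the lab feed is fifteen minutes behind, which is the question that matters on a real ward.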
Treating context seriously also means building it into governance. Vendors could describe the contextual boundaries of their models: the settings they’ve been tested in, the assumptions they make about users and data flow, and the conditions under which performance begins to degrade. Regulators and insurers could then evaluate whether those assumptions still hold in the intended environment. This would also be more useful than the training-data questions we currently ask, since most healthcare AI systems are now agents or ensembles of models, many built in part on the large proprietary training sets of major AI companies.
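Such a contextual declaration could be as simple as a structured “context card” shipped alongside the model. The field names and values below are entirely hypothetical, a sketch of what a vendor might disclose rather than any existing standard:

```python
# Hypothetical "context card" a vendor might publish with a model,
# declaring validated settings, assumptions, and known degradation modes.
# All field names and values are illustrative.
context_card = {
    "validated_settings": [
        "academic medical center ICU",
        "community hospital emergency department",
    ],
    "data_flow_assumptions": {
        "labs": "HL7 feed with under 15 minutes of latency",
        "vitals": "continuous monitor, sampled at least hourly",
    },
    "known_degradation": [
        "lab delays over 30 minutes",
        "pediatric populations (not validated)",
    ],
}

for setting in context_card["validated_settings"]:
    print("validated in:", setting)
```

A regulator or health system could then diff this card against its own environment before deployment, the same way it would check an instrument's intended-use statement.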
The move toward context engineering recognizes that intelligence without situational awareness isn’t enough. A model can be perfectly accurate and still unsafe if it doesn’t understand where, how, and by whom it’s being used. Evaluating AI through the lens of context, including its adaptability, resilience, and integration with human decision-making, shifts the focus from theoretical performance to practical reliability.
If the last decade of AI was about making models smarter, the next one must be about making systems wiser. In healthcare, that means building tools that grasp the nuance of the situations they’re meant to support. Metrics will always matter, but context is what turns good numbers into good medicine.
If you’re working on healthcare AI and want help making it safer, smarter, or more clinically grounded, that’s what I do. I work with companies and investors on evaluation, governance, and clinical integrity. You can reach me through Validara Health or by email sarah (at) validarahealth.com.