The Rise of Just-in-Time AI: Maximizing Impact, Minimizing Costs

Generative AI (Gen AI) is reshaping workflows when it is deployed strategically at critical moments, which maximizes its effect while keeping operational costs under control. This approach, often called "just-in-time" AI, is drawing both enthusiasm and skepticism from CIOs and IT leaders.

What Is Just-in-Time AI?

Just-in-time AI borrows its name from just-in-time manufacturing and the Japanese Kanban scheduling system that supports it, both of which emphasize efficiency and timely execution. In the context of AI, the approach means activating generative AI models precisely when they are needed, which avoids unnecessary costs and keeps insights relevant and current.

Sastry Durvasula, Chief Information and Client Services Officer at TIAA, explains how their "Research Buddy" AI tool exemplifies this principle. Used by Nuveen, TIAA's asset management arm, Research Buddy delivers insights from public documents only when requested, enhancing efficiency and relevance.

“The timeliness is critical. You don’t want to do the work too much in advance because you want that real-time context. We activate the AI just in time,” says Durvasula.

This strategy not only optimizes workflow but also addresses the high costs associated with generative AI processing.
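
The pattern Durvasula describes can be sketched in a few lines of Python. This is a minimal illustration, not TIAA's implementation: the class name, the `fetch_documents` and `generate` hooks, and the stand-in functions are all hypothetical. The point it shows is that context is gathered and the model is called only when a question arrives, so nothing is computed in advance.

```python
from typing import Callable, List


class JustInTimeAssistant:
    """Calls the model only when a user asks, using context fetched at that moment."""

    def __init__(
        self,
        fetch_documents: Callable[[str], List[str]],  # hypothetical retrieval hook
        generate: Callable[[str], str],               # hypothetical model call
    ) -> None:
        self.fetch_documents = fetch_documents
        self.generate = generate
        self.model_calls = 0  # rough proxy for spend

    def answer(self, question: str) -> str:
        # Fetch context at request time so the answer reflects current data.
        context = "\n".join(self.fetch_documents(question))
        self.model_calls += 1
        return self.generate(f"Context:\n{context}\n\nQuestion: {question}")


# Stand-ins (assumptions) so the sketch runs without any external service.
def fake_fetch(question: str) -> List[str]:
    return [f"Public document mentioning: {question}"]


def fake_generate(prompt: str) -> str:
    return f"Summary based on {len(prompt)} characters of prompt"


assistant = JustInTimeAssistant(fake_fetch, fake_generate)
```

Until `assistant.answer(...)` is called, `model_calls` stays at zero, which is the cost argument in miniature: no request, no model spend.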

The Cost Factor

Generative AI can be expensive to deploy, especially when used indiscriminately. Durvasula warns, "The cost of AI can be astronomically high and not always justified in terms of business value."

However, as Forrester analyst Mike Gualtieri points out, the cost of AI should be evaluated in context. For high-stakes scenarios where significant financial outcomes are involved, the investment in Gen AI may pale in comparison to the value it generates.

"If it costs you a million dollars and saves you $10 million, then cost should not hold you back," Gualtieri argues.

The key, he suggests, lies in knowing when cost should be a factor and when it should not, particularly when leveraging pre-trained large language models (LLMs) and retrieval-augmented generation (RAG) services.
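
Gualtieri's example reduces to simple arithmetic, shown here as a short Python sketch. The function names are illustrative, and the figures are the ones from his quote, not data from a real project.

```python
def net_value(cost: float, benefit: float) -> float:
    """Benefit left over after paying for the AI workload."""
    return benefit - cost


def benefit_cost_ratio(cost: float, benefit: float) -> float:
    """How many dollars of benefit each dollar of cost returns."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return benefit / cost


cost = 1_000_000      # "costs you a million dollars"
savings = 10_000_000  # "saves you $10 million"

print(net_value(cost, savings))           # 9000000
print(benefit_cost_ratio(cost, savings))  # 10.0
```

A ratio well above 1 is the case where, in Gualtieri's words, cost should not hold you back; a ratio near or below 1 is where cost has to be a deciding factor.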

Techniques to Reduce Costs

RAG services are a powerful way to manage costs while improving the quality and relevance of generative AI outputs. These services retrieve relevant enterprise data and supply it to a pre-trained LLM as context at the moment of need, which avoids expensive model training and reduces reliance on high-cost data science talent.

“Vendors are providing built-in RAG solutions so enterprises won’t have to build them themselves,” notes Gualtieri. For example, Google’s RAG service allows businesses to integrate real-time data with pre-trained models seamlessly.
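
To make the retrieval step concrete, here is a toy Python sketch of the RAG idea. It is an assumption-laden illustration, not any vendor's API: real services typically rank documents with embeddings and a vector index, whereas this version swaps in plain word overlap, and the sample documents are made up. It stops at building the prompt and does not call a model.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by shared words with the query (toy relevance score)."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, documents: List[str]) -> str:
    """Place the retrieved text in the prompt so the model needs no retraining."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up sample text, for illustration only.
documents = [
    "Q3 revenue grew on strong bond fund inflows.",
    "The cafeteria menu changes every Monday.",
    "Bond fund fees were reduced in the third quarter.",
]

prompt = build_prompt("What happened to bond fund revenue?", documents)
print(prompt)
```

Only the two bond-related documents reach the prompt. The model itself is untouched, which is why this approach sidesteps training costs.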

Case Study: SAIC’s Tenjin GPT

One standout example of just-in-time AI in action is SAIC’s Tenjin GPT, a generative AI platform rolled out to its 24,000 employees. Built on Microsoft Azure and OpenAI, the platform is used to enhance specific workflows, including:

IT service incident resolution
Customer service inquiries
AI-assisted software development
Data preparation and visualization

SAIC’s CIO, Nathan Rogers, emphasizes that this initiative aims to empower employees with AI-enabled tools that allow for timely, data-driven decision-making. "We will ultimately have citizen developers throughout the whole company who can get to a decision-making just-in-time moment," he states.

Challenges and Considerations

While just-in-time AI has clear advantages, it also brings challenges, including the high computational demands of generative AI models and the difficulty of ensuring unbiased, reliable outputs without a human in the loop (HITL).

Max Chan, CIO of Avnet, critiques the term "just-in-time," suggesting that it might be better described as using the "right technique in the right places" to balance costs and efficiency.

Durvasula adds that responsible AI governance must be embedded in the system to ensure ethical and effective outcomes. In TIAA’s case, Nuveen analysts validate Research Buddy’s results before they are acted upon, providing an additional layer of quality assurance.
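
A validation step like this can be modeled as a simple gate. The Python sketch below is hypothetical and is not a description of how TIAA or Nuveen built their workflow: the `Draft` type and function names are invented for illustration. It shows one design choice worth noting, which is that unreviewed output is held back by default rather than released.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Draft:
    text: str
    approved: Optional[bool] = None  # None means no human has reviewed it yet


def review(draft: Draft, approve: bool) -> Draft:
    """Record a human reviewer's decision on an AI-generated draft."""
    draft.approved = approve
    return draft


def releasable(drafts: List[Draft]) -> List[str]:
    """Only drafts a human explicitly approved may be acted upon."""
    return [d.text for d in drafts if d.approved is True]


drafts = [Draft("Summary A"), Draft("Summary B"), Draft("Summary C")]
review(drafts[0], approve=True)
review(drafts[1], approve=False)
# drafts[2] is never reviewed, so it is held back as well.

print(releasable(drafts))  # ['Summary A']
```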

The Just-in-Case Perspective

For some workflows, "just-in-case" AI might be more appropriate. In scenarios like investment-driven decision-making, having insights readily available, even if they are not immediately needed, can be invaluable. Durvasula highlights the need for real-time personalization and low latency in such high-value use cases.

Lessons from Japanese Efficiency

Generative AI’s incremental implementation mirrors the revolutionary Japanese manufacturing techniques that focused on reducing inefficiencies in small but meaningful ways. Whether deployed as just-in-time, just-in-case, or part of a hybrid approach, success with AI depends on strategic planning, thoughtful execution, and a clear understanding of business value.

By striking the right balance, organizations can unlock AI’s potential to deliver transformative outcomes, one carefully timed deployment at a time.
