AI Engineering in the Real World: From Codebases to LLMs, and the Rise of the Virtual Delivery Center

Once a buzzword reserved for academia or niche ML teams, AI Engineering has swiftly become a practical, in-demand field that is reshaping how software products are conceived, developed, and deployed. But as companies scramble to build AI-powered features, what does AI engineering look like on the ground—inside real engineering teams, real businesses, and real products?

This isn’t about hypothetical use cases. This is a deep dive into the lived reality of developers who’ve crossed the chasm from traditional software engineering to what is now being called AI engineering. Across seven companies, spanning HR tech to legal AI to incident response platforms, we uncover the lessons, tools, challenges, and evolving team structures powering this transition.

But before we explore the tech and teams, let’s answer the big question:


What Exactly Is AI Engineering?

Ask five people and you’ll get five definitions.

Some equate it with machine learning engineering—training models, optimizing parameters. Others stretch it to include data scientists working with large language models (LLMs). A growing camp, however, aligns with Chip Huyen’s framing in her book AI Engineering: building applications that use LLMs—sitting somewhere between software engineering and ML engineering.

In most cases, AI engineering today looks like this: software engineers building new product capabilities using LLM APIs, orchestrating context, chaining prompts, and integrating outputs into real-world workflows. As these applications get more complex, they begin to resemble ML workflows—complete with fine-tuning, eval pipelines, and RAG architectures.
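
In practice, that can be as small as the sketch below: one LLM API call feeding another, with the chained output dropped into an ordinary workflow. This is a minimal illustration, not anyone's production code; the model name, prompts, and helper are invented, and the openai SDK plus an OPENAI_API_KEY in the environment are assumed.

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Step 1: turn unstructured input into something structured.
    ticket = "App crashes when uploading a PNG larger than 10 MB."
    summary = ask(f"Summarize this bug report in one sentence: {ticket}")

    # Step 2: chain the first output into a second prompt.
    plan = ask(f"Given this bug summary, list three debugging steps: {summary}")
    print(plan)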


What Are Companies Actually Building?

Across domains and team sizes, the real-world use cases fall into a few buckets:

1. Augmenting Incident Response: Incident.io

This platform, used to manage outages, added an AI-powered incident note taker and an AI investigator agent. These tools summarize live conversations, identify action items, and even analyze code and logs in real time.

Stack: Postgres + pgvector, GPT-4o, Sonnet 3.7, Go backend, React frontend, Kubernetes on GCP.
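
For a concrete flavor of the retrieval step this stack implies, here is a minimal sketch: embeddings of incident chat messages stored in Postgres with pgvector, queried for the most relevant context before the LLM summarizes it. The table, columns, and embed() helper are hypothetical, not Incident.io's actual schema.

    import psycopg

    def most_relevant_messages(conn, query_embedding: list[float], k: int = 10):
        # pgvector's <=> operator is cosine distance: smaller means more similar.
        # pgvector accepts a bracketed text literal cast to vector.
        with conn.cursor() as cur:
            cur.execute(
                "SELECT body FROM incident_messages "
                "ORDER BY embedding <=> %s::vector LIMIT %s",
                (str(query_embedding), k),
            )
            return [row[0] for row in cur.fetchall()]

    # conn = psycopg.connect("dbname=incidents")
    # context = most_relevant_messages(conn, embed("what broke?"))
    # ...then pass `context` to GPT-4o with a summarization prompt.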

2. Developer Productivity: Sentry

From bug reports to PRs, Sentry’s Autofix uses RAG, telemetry, and embeddings to trace issues and generate fixes, while Issue Grouping uses approximate nearest neighbor (ANN) techniques to reduce alert fatigue.

Stack: Custom LLM agent architecture, PostgreSQL, ClickHouse, PyTorch, Kubernetes, in-house frameworks.
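
The grouping idea can be illustrated with plain cosine similarity. Production systems like Sentry's use ANN indexes rather than this brute-force loop, and the threshold below is invented for the sketch.

    import numpy as np

    def assign_group(new_emb: np.ndarray, centroids: list[np.ndarray],
                     threshold: float = 0.85):
        # Compare a new error's embedding against existing group centroids.
        best_idx, best_sim = None, -1.0
        for i, c in enumerate(centroids):
            sim = float(np.dot(new_emb, c) /
                        (np.linalg.norm(new_emb) * np.linalg.norm(c)))
            if sim > best_sim:
                best_idx, best_sim = i, sim
        # Close enough to an existing group: merge instead of re-alerting.
        return best_idx if best_sim >= threshold else None  # None = new group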

3. Legal AI Assistants: Wordsmith

They’re reimagining how legal teams work. Their AI Contract Reviewer flags and summarizes potential legal risks in documents, and their workspace integrates with enterprise communication tools to auto-draft and analyze documents.

Stack: Pinecone, LangChain, LlamaIndex, multi-cloud routing (AWS Bedrock, Azure OpenAI, GCP), Groq for speed-critical tasks.
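
Multi-cloud routing sounds exotic but can start as a simple dispatch table. The sketch below is hypothetical, not Wordsmith's code; the task names and provider assignments are illustrative.

    def pick_provider(task: str) -> str:
        # Route each task type to the provider that fits its constraints.
        routing = {
            "contract_review": "aws_bedrock",    # data residency & compliance
            "drafting": "azure_openai",          # general-purpose generation
            "autocomplete": "groq",              # speed-critical, low latency
        }
        return routing.get(task, "aws_bedrock")  # conservative default

    # pick_provider("autocomplete")  -> "groq"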

4. AI Coding Assistants: Augment Code

With IDE plugins and Slack bots, their tools are powered by LLMs fine-tuned on large engineering corpora.

Stack: Google Cloud, NVIDIA GPUs (CUDA), A3 Mega clusters, PyTorch, custom training libraries.
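
A heavily simplified sketch of what fine-tuning a causal LM on a code corpus looks like, in the spirit of what Augment describes. The model id, data file, and hyperparameters are placeholders; real fine-tuning on A3 Mega clusters involves far more (sharding, evals, data filtering).

    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              Trainer, TrainingArguments)
    from datasets import load_dataset

    model_id = "gpt2"  # stand-in for a real code model
    tok = AutoTokenizer.from_pretrained(model_id)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_id)

    ds = load_dataset("text", data_files={"train": "my_code_corpus.txt"})

    def tokenize(batch):
        out = tok(batch["text"], truncation=True, max_length=512,
                  padding="max_length")
        out["labels"] = out["input_ids"].copy()  # causal LM: predict the input
        return out

    train = ds["train"].map(tokenize, batched=True, remove_columns=["text"])

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=train,
    ).train()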

5. RAG for Science & Medicine: Elsevier

When multiple teams started building their own LLM products, Elsevier created a shared RAG Platform to serve enterprise-wide AI needs.

Stack: LangChain, FastAPI, OpenSearch, AWS Bedrock & Azure OpenAI, Snowflake, Airflow, Fargate.
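
The platform idea is essentially "RAG as an internal service": one endpoint any team can call. A toy FastAPI version might look like the sketch below, where retrieve() and generate() are placeholders for OpenSearch queries and Bedrock/Azure OpenAI calls, not Elsevier's actual API.

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Query(BaseModel):
        question: str
        top_k: int = 5

    def retrieve(question: str, k: int) -> list[dict]:
        # Placeholder: swap in an OpenSearch k-NN / hybrid search query.
        return [{"id": "doc-1", "text": "relevant passage"}]

    def generate(question: str, docs: list[dict]) -> str:
        # Placeholder: swap in a Bedrock or Azure OpenAI chat call,
        # with the retrieved docs injected as context.
        return f"Grounded answer to: {question}"

    @app.post("/rag/answer")
    def answer(q: Query) -> dict:
        docs = retrieve(q.question, q.top_k)
        return {"answer": generate(q.question, docs),
                "sources": [d["id"] for d in docs]}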

6. Regulated Chatbots: Simply Business

They built a compliant chatbot that only outputs pre-approved answers. Because it can only return vetted content, it cannot hallucinate, and it gracefully defers to a human when it is unsure.

Stack: AWS Bedrock, Anthropic Sonnet 3.5, Rails on ECS, approved-answer KB.
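
The core trick is that generation never happens freely: the bot only selects from vetted answers and bails out below a confidence threshold. A hedged sketch follows; the questions, answers, and the 0.80 cutoff are all illustrative, not Simply Business's implementation.

    import numpy as np

    APPROVED = {
        "What does public liability insurance cover?": "It covers claims from ...",
        "How do I cancel my policy?": "You can cancel by ...",
    }

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def respond(question_emb: np.ndarray, approved_embs: dict) -> str:
        # approved_embs maps each vetted question to its precomputed embedding.
        best_q = max(approved_embs,
                     key=lambda q: cosine(question_emb, approved_embs[q]))
        if cosine(question_emb, approved_embs[best_q]) < 0.80:
            # Not confident enough: never improvise, defer to a human.
            return "Let me connect you with a colleague who can help."
        return APPROVED[best_q]  # only ever a vetted, pre-approved answer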

7. Survey Summarization: Data Solutions International (DSI)

DSI, a small HR tech firm, built a summarization tool to cluster thousands of employee survey comments. With no prior AI background, the team built it in-house after rejecting a bloated agency proposal.

Stack: AWS Bedrock, Cohere Embed v3, PostgreSQL with pgvector, Java backend, React frontend.
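
A stripped-down version of that pipeline: embed each free-text comment, cluster the embeddings, then summarize one cluster at a time. This sketch calls Cohere's SDK directly for brevity (DSI goes through Bedrock), and the model id and cluster count are assumptions.

    import cohere
    from sklearn.cluster import KMeans

    co = cohere.Client("YOUR_API_KEY")  # or read the key from the environment

    comments = ["Great manager support", "Pay is below market",
                "My manager really listens", "Love the team"]
    embs = co.embed(texts=comments, model="embed-english-v3.0",
                    input_type="clustering").embeddings

    labels = KMeans(n_clusters=2, n_init="auto").fit_predict(embs)
    for label, comment in sorted(zip(labels, comments)):
        print(label, comment)  # each cluster then gets one LLM summary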


The Hidden Engine Behind All This: Virtual Delivery Centers (VDCs)

What’s enabling these transitions—from zero to production-grade GenAI features?

Enter the Virtual Delivery Center (VDC) model.

Traditional team-building approaches simply can’t keep up with the experimentation, cross-domain knowledge, and rapid prototyping that AI engineering demands. You need a fluid, fast, and flexible model—something VDCs are purpose-built for.

A VDC is essentially an on-demand, cloud-native engineering team—assembled with pre-vetted talent, available instantly, scalable as needed, and capable of owning delivery end-to-end. It’s not just freelancing. It’s structured execution.

In a VDC model:

  • A company doesn’t have to hire AI engineers full-time or sift through resumes.

  • Instead, it spins up a delivery center in the cloud, powered by experts in LLMs, RAG, prompt engineering, evals, fine-tuning, and more.

  • These experts build, iterate, and deliver—while the company keeps strategic oversight.

Several of the examples above—especially from smaller teams like DSI or early-stage startups like Wordsmith—mirror the VDC ethos, even if not formally labeled as such. It’s the new operating model for building with GenAI.


Getting Started as a Software Engineer in AI Engineering

Most transitions begin not with fancy degrees, but with curiosity and initiative.

Take Ryan Cogswell from DSI. He had no formal AI background. His company considered an agency whose proposed solution was overengineered and would have cost more than their entire existing infrastructure. Instead, Ryan learned, prototyped, and delivered a better, leaner product in two months.

His stack? AWS Bedrock, Cohere, PostgreSQL, Java, and React. No SageMaker. No Lambda jungle. Just practical, understandable components.


Navigating Non-Determinism and Vibe-Based Dev

AI doesn’t behave like conventional software. As Ross McNairn of Wordsmith puts it:

“You don’t debug a prompt like a function. You iterate. You test. You vibe-check.”

This non-determinism is what makes AI hard. Evaluation is often qualitative, and the challenge is as much an engineering mindset shift as a technical one.
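
In practice, "vibe-checking" gets formalized into lightweight evals: run a prompt many times and score the outputs instead of asserting one exact answer. A toy harness is sketched below, where generate() is a stand-in for any real LLM call.

    import random

    def generate(prompt: str) -> str:
        # Stand-in for a real LLM call; deliberately non-deterministic.
        return random.choice(["Refunds take 5 business days.",
                              "Refunds are usually done in 5 business days."])

    def pass_rate(prompt: str, must_contain: list[str], n: int = 20) -> float:
        hits = 0
        for _ in range(n):
            out = generate(prompt).lower()
            if all(term.lower() in out for term in must_contain):
                hits += 1
        return hits / n  # gate a release on, say, pass_rate >= 0.95

    print(pass_rate("How long do refunds take?", ["refund", "5 business days"]))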

Matt Morgis left an engineering manager role to become an IC again, just to work with GenAI. For seasoned engineers, the joy of creation returns. But for juniors, these tools can hinder learning fundamentals. Knowing when and how to use them matters.


Cost and Complexity: More Than Just Tokens

Yes, LLMs are expensive. But cost isn’t just about API fees—it’s about architecture choices, fine-tuning, latency, user trust, privacy, and more.

For example:

  • Incident.io built in-house tools for better latency control.

  • Elsevier centralized infra to reduce duplication.

  • Simply Business used guardrails and fallback mechanisms for compliance.

Smart teams simplify, iterate, and choose constraints wisely.
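
On the API-fee line item alone, a back-of-envelope model helps teams sanity-check a design before building it. The per-token prices below are illustrative placeholders; check your provider's current rate card.

    IN_PRICE = 2.50 / 1_000_000    # $ per input token (assumed)
    OUT_PRICE = 10.00 / 1_000_000  # $ per output token (assumed)

    def monthly_cost(requests: int, in_tokens: int, out_tokens: int) -> float:
        return requests * (in_tokens * IN_PRICE + out_tokens * OUT_PRICE)

    # 100k requests/month, 2k-token prompts, 500-token answers:
    print(f"${monthly_cost(100_000, 2_000, 500):,.2f}")  # about $1,000/month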


Common Tech Stack Trends Emerging

Across all these use cases, a few trends stand out:

Component and why it’s common:

  • AWS Bedrock: secure, enterprise-grade access to LLMs (especially Anthropic models)
  • PostgreSQL + pgvector: a familiar, simple vector store for fast prototyping
  • LangChain (with caution): a useful orchestration layer, though some teams prefer custom frameworks
  • Cohere, Groq: specialized performance or embedding advantages
  • Kubernetes / ECS: scaling infrastructure, especially as usage spikes
  • React + FastAPI: lightweight UIs and APIs that bring LLM features to users

But here’s the deeper point: there’s no perfect stack. The best stack is the one you can understand, evolve, and optimize.


Final Thoughts: The Future Is in the Builders’ Hands

AI engineering isn’t just about adopting new tools—it’s a full-stack rethinking of how we build, deploy, and maintain intelligent software.

We’re entering a new era—where coding meets cognition, and software becomes conversational. Those who can bridge traditional engineering discipline with the experimental spirit of AI will lead the next wave.

And increasingly, they’ll do it through Virtual Delivery Centers—cloud-based teams that give organizations the power to explore, test, scale, and productize AI at speed, without being bogged down by hiring, procurement, or overhead.

If software is still eating the world, AI engineers are now picking the menu.
