Pilot Purgatory
A Story of AI Transformation
By Scott Weiner (AI Lead at NeuEon, Inc.), inspired by conversations with Erwann Couesbot (CEO of FlipThrough.ai)
Note: This is a work of fiction. All characters, companies, and events are fictional composites created for illustrative purposes. While the industry statistics cited are real and sourced, the narrative is designed to illuminate common patterns in enterprise AI adoption, not to depict any actual organization or individuals.
This is Part 3 of a serialized story exploring why enterprise AI initiatives fail—not from lack of technology or talent, but from invisible organizational dynamics that doom them from the start.
Reading the series for the first time? Start with Part 1: The Mandate
Missed Part 2? Read Chapter 2: Foundations
Previously in Pilot Purgatory…
The team’s early momentum produced impressive results: an 87% accurate chatbot that wowed leadership in a demo. Daniel’s late-night breakthrough with fine-tuning showed the technology could genuinely learn Thornfield’s domain. The build-versus-buy debate ended with a pragmatic “wrap” approach—build what’s unique, buy what’s commodity.
But warning signs emerged. Sarah Martinez documented security gaps mapped to the OWASP Top 10 for LLM Applications, only to hear her concerns “added to the backlog.” Priya Gupta discovered that Thornfield’s data wasn’t ready—fifteen years of accumulated inconsistencies across four systems with three different schemas. The backlog, it turns out, is where priorities go to wait.
Now, with board pressure mounting and a Q2 deadline looming, Linda Chen faces an impossible task: evaluating 32 AI vendors with expertise she doesn’t have…
Chapter 3: Procurement
Linda Chen spread the RFP responses across her conference table like a dealer laying out cards for a high-stakes game. Thirty-two proposals. Thirty-two vendors. Thirty-two variations on promises she couldn’t verify.
She had been doing enterprise procurement for twenty-three years. She had evaluated ERP systems, cloud platforms, security tools, and manufacturing execution software. She knew how to read a proposal, how to spot the gaps between marketing language and delivery capability, how to negotiate terms that protected the company when implementations went sideways.
AI felt different.
The proposals all used the same vocabulary. “Enterprise-ready.” “Seamless integration.” “Industry-leading accuracy.” “Powered by state-of-the-art large language models.” She had highlighted these phrases in the first few proposals before realizing they appeared in all of them. Different logos, same buzzwords.
Her legal pad sat beside the stack, a habit from decades of vendor evaluation. Two columns: “What They Said” on the left, “What They Meant” on the right. The left column was filling up nicely. The right column remained mostly empty.
What did “seamless integration” actually mean when your data lived in four different systems with three different schemas? What did “enterprise-ready” mean for a technology that had been available for less than two years? What did “state-of-the-art” mean when the state of the art changed every six months?
She didn’t know, and that was the problem.
“Any luck?”
Priya Gupta leaned against the doorframe, coffee cup in hand, the relaxed posture of someone who had seen enough procurement cycles to find them almost amusing.
“They all say the same things,” Linda said. “Every single one promises they’re enterprise-ready, seamless integration, proprietary advantage.”
“What do you actually need from the data side?”
Linda appreciated that about Priya. Cut through the noise. Ask the real question.
“Native connectors to our warehouse. Schema validation on ingest. Lineage tracking for audit.” She tapped the stack of proposals. “And ideally, someone who admits their product won’t fix our data quality problems.”
“That last one isn’t in any of these proposals.”
“No,” Linda said. “It wouldn’t be.”
Priya entered the room, set down her coffee, and picked up one of the proposals at random. She flipped to the architecture section, her eyes scanning the diagrams with the practiced efficiency of someone who read technical documentation the way others read novels.
“This one assumes your data is already in a standard format,” she said. “See here? They’re pulling from what they call a ‘unified data layer.’ We don’t have that.”
“What do we have?”
“Fifteen years of accumulated product information spread across legacy systems, modern databases, and spreadsheets that someone created in 2008 and never migrated.” Priya set down the proposal. “The format changed three times during ERP transitions. Half of our product specifications exist only in PDF documents that no one has converted to structured data.”
Linda felt the familiar weight of a procurement decision that was harder than it should be. “Can we clean it up?”
“With time and resources, yes. But we’re talking six months minimum just to standardize the core product data. More if we want to include historical service records and customer interactions.”
“Marcus wants to launch by Q2.”
“Then Marcus will need to accept that the AI is only as good as the data we feed it.” Priya reclaimed her coffee. “Which is the part no one wants to hear.”
The vendor demos started the following week.
Linda had narrowed the field to eight finalists, chosen through a combination of reference checks, pricing analysis, and what she privately called the “realistic promises” filter. Vendors who had been honest about limitations made the cut. Vendors who had claimed their product could do everything out of the box did not.
The first demo was Nexus AI, a well-funded startup with an impressive client list and a sales team that arrived in matching branded polo shirts.
“What we’re offering,” the lead presenter said, clicking to a slide showing happy customers in a manufacturing environment, “is a complete AI transformation platform. From raw data to actionable insights, we provide the synergy between your existing systems and cutting-edge machine learning.”
Linda wrote “synergy” on her legal pad. First mention.
“Our platform enables seamless integration with legacy systems,” the presenter continued. “We’ve developed proprietary connectors that create synergy between disparate data sources, allowing our models to synthesize information across your entire enterprise.”
Second mention. Linda underlined it.
“What makes Nexus AI unique is our focus on enterprise synergy. We don’t just bolt on AI. We create a harmonious ecosystem where your data, your processes, and our technology work together as one.”
Third mention. Linda set down her pen.
“Can you show us how the connectors actually work?” she asked. “Specifically, how they handle data that isn’t in a standard format?”
The presenter’s smile didn’t waver, but something behind it flickered. “Absolutely. Our professional services team works with each client to customize the integration layer for their specific environment.”
“What does ‘customize’ mean in terms of timeline and cost?”
“That would depend on a detailed assessment of your current data architecture. But I can say that most implementations are complete within three to four months.”
“Most implementations,” Linda repeated. “What percentage?”
The presenter glanced at his colleagues. “I’d have to get back to you with specific numbers.”
Linda wrote in her “What They Meant” column: “We have no idea.”
The second demo was Quartzvane, an established enterprise software company that had pivoted to AI eighteen months earlier. Their presentation was more polished, their claims more measured, their slides filled with metrics and case studies.
“We’ve processed over 4 billion documents across our client base,” the presenter said. “Our accuracy rates consistently exceed industry benchmarks, and our enterprise customers report an average of 30% improvement in operational efficiency within the first year.”
“How do you define operational efficiency?” Marcus asked. He had joined for this demo, sitting at the back of the room with the quiet attention of someone evaluating an investment.
“That’s an excellent question. Each client defines their own success metrics based on their specific use cases. The 30% figure is an aggregate across all client-defined metrics.”
“So it could mean different things for different clients.”
“Exactly. That’s the beauty of our platform. It adapts to your definition of success.”
Linda recognized the move. Answer the question without answering the question. Make vagueness sound like flexibility.
“What about data quality requirements?” Priya asked. She had joined the demos at Linda’s request, the technical ballast against the marketing swell. “Our product data has significant inconsistencies. How does your platform handle schema variations?”
“We have robust data normalization capabilities built into our ingestion layer. Our AI can identify and correct most common data quality issues automatically.”
“Automatically,” Priya repeated. “Can you show us an example? Specifically, how it handles conflicting product specifications from different source systems?”
The presenter pulled up a demo environment. Clean data flowed through colorful dashboards. Charts updated in real time. Everything worked perfectly.
“This is beautiful,” Priya said. “But this isn’t our data. This is your demo data. Can we run a test with an actual extract from our systems?”
A pause—the kind that happens when a question exposes the gap between demonstration and reality.
“We typically do that during the pilot phase, after the contract is signed.”
Priya caught Linda’s eye. The look said: they don’t know how messy our data is.
By the fifth demo, Linda had developed a system.
She tracked the number of times each vendor used the word “synergy” (average: 2.7). She noted how long it took them to defer to “professional services” when asked technical questions (average: four questions). She recorded how many times they showed their own demo data versus client data (100% their own).
The patterns were consistent: everyone promised transformation, but no one could explain how the transformation would work with Thornfield’s specific constraints.
“They’re all selling the same thing,” she told Marcus after the seventh demo. “Different packaging, same product.”
“Which one is best?”
“I don’t know how to answer that. They’re all equally confident and equally vague. It’s like trying to choose between weather forecasters who all predict sunny skies.”
Marcus leaned back in his chair. The stress ball on his desk, the one from some vendor conference she couldn’t remember, rolled slowly across the surface. He caught it without looking.
“What does your gut say?”
“My gut says we’re buying something we don’t fully understand from vendors who don’t fully understand our needs.” She paused. “But my gut also says we have to buy something. David wants results. Jennifer wants progress. We can’t build everything ourselves.”
“So we pick one and hope for the best?”
“We pick the one that seems most honest about limitations and hope their professional services team is as good as they claim.”
Marcus squeezed the stress ball. The logo on its surface had been worn smooth years ago, the identity of the vendor long forgotten. “Which one is most honest?”
Linda thought about the demos she had watched, the confident smiles and polished slides that never quite addressed the questions that mattered.
“The one with the most buzzwords?” she said, not quite joking.
Priya found her crisis in the data warehouse.
She had known the data was messy. Twenty years of data engineering had taught her that enterprise data was always messy, accumulated like sedimentary layers, each era of technology leaving its mark: flat files from the pre-database days, relational tables from the ERP implementation, document stores from the content management migration.
But knowing the data was messy and quantifying the mess were different experiences.
She started with the product specifications, the core dataset that would feed the AI models. Fifty-three thousand SKUs, each with technical specifications, pricing tiers, compatibility matrices, and service histories.
Except the specifications weren’t consistent.
For products introduced before 2008, specifications were stored in a legacy format that used imperial units exclusively. Products from 2008 to 2015 mixed imperial and metric depending on which engineer had done the data entry. Products after 2015 used metric only, except for customer-facing dimensions, which remained imperial for the North American market.
Temperature ratings used different scales, weight tolerances used different precision levels, and mounting specifications referenced technical drawings that existed only on paper in a filing cabinet somewhere in the engineering department.
She pulled a sample set: one thousand random products from across the timeline. She wrote queries to identify inconsistencies. She built reports that visualized the variation.
The results were worse than she had expected.
Forty-three percent of products had at least one specification that conflicted with another source. Eighteen percent had specifications that were simply missing. Seven percent had specifications that were demonstrably wrong, referencing materials or dimensions that didn’t exist.
She thought about the AI models Daniel was building. Models that would read this data and make predictions. Models that would tell customers about product specifications. Models that would generate confident, authoritative answers based on information that was, in many cases, garbage.
Garbage in, garbage out. The oldest truth in data engineering.
She documented her findings in a report that took three days to write. Charts showing the variation. Tables showing the error rates. Recommendations showing what it would take to fix the foundation before building on top of it.
Six months minimum and two full-time data engineers for a project to standardize schemas, validate entries, and fill gaps: the kind of work that wasn’t exciting, didn’t demo well, and would never make it into a board presentation.
She saved the report and scheduled a meeting with Marcus.
“Six months minimum,” Priya said. “That’s not pessimism. That’s math.”
Marcus studied the charts she had brought to his office. The visualizations were clear, the conclusions unavoidable. The data wasn’t ready. The data might never be ready, not in the way the AI vendors assumed.
“We’re supposed to launch by Q2,” he said.
“Then we launch with data that will produce unreliable results.” Priya’s voice was calm, factual. She wasn’t arguing. She was reporting. “The models will learn from inconsistent specifications. They’ll make predictions based on errors. And they’ll do it with the confidence that makes AI both powerful and dangerous.”
“What if we focus on a subset? The products from 2015 onward, where the data is cleaner?”
“That’s twelve thousand SKUs out of fifty-three thousand. Less than a quarter of our catalog.” She pointed to a chart. “And even within that subset, we have issues. Customer service records that reference the wrong product codes. Pricing tiers that don’t match the current system. Compatibility information that was never migrated from the old format.”
Marcus set down the report. Through his office window, the engineering floor was visible, Daniel’s team clustered around monitors, building models that would consume whatever data they were given.
“What do you recommend?”
“Start small. Pick a hundred products with verified specifications. Train models on that subset. Expand only when we can verify the data underneath.” She paused. “It’s slower, but it’s honest.”
“David won’t like slower.”
“David won’t like launching a system that gives customers wrong information either.”
Marcus rubbed his eyes. The stress ball sat on his desk, untouched for once. “Do what you can with what we have. Flag the highest-risk areas. We’ll work around the gaps.”
Priya recognized the phrase. Do what you can. The enterprise equivalent of “figure it out.” A decision that wasn’t quite a decision, a direction that wasn’t quite a direction.
“I’ll do what I can,” she said. “But I want it on record that I recommended a different approach.”
“It’s on record.”
She left his office knowing that the record wouldn’t matter much when things went wrong. Records never did. What mattered was results, and the results were going to be built on a foundation that wasn’t ready to hold them.
The contract with Quartzvane was signed on a Tuesday afternoon.
Linda had negotiated hard, extracting concessions on implementation timelines, professional services hours, and exit clauses if certain milestones weren’t met. The final number was $800,000 for the first two years: platform licensing, integration support, and training for the Thornfield team.
Marcus signed the authorizing documents in his office, David’s approval already secured through a series of emails and one brief conversation that emphasized competitive pressure more than technical readiness.
“I hope this is the right choice,” Linda said, watching Marcus’s pen move across the signature line.
“It’s a choice,” Marcus replied. “Right or wrong, we’ll find out in six months.”
“That’s not particularly reassuring.”
“I know.” He set down the pen. “But we couldn’t wait forever. The board wants progress. David wants results. This gives us something to show.”
Linda gathered the signed documents. Through Marcus’s window, she could see the engineering floor where Daniel’s team was already integrating with Quartzvane’s APIs. The work was real. The money was real. The timeline was real.
Only the confidence was manufactured.
“Their professional services team starts next week,” she said. “I’ve scheduled the kickoff for Monday.”
“Good. Let’s make this work.”
She nodded and left, the contract heavy in her hands. Eight hundred thousand dollars committed to a platform she hoped would deliver, based on demos that had been impressive and data that wasn’t ready.
In her office, she added a final note to her legal pad. Under “What They Said,” she wrote: “Complete solution for your AI transformation needs.”
Under “What They Meant,” she finally had an answer: “TBD.”
The professional services team arrived with laptops and optimism.
Three consultants from Quartzvane, armed with impressive credentials and the practiced patience of people who had seen enterprise implementations go sideways before, set up in a conference room that became their war room, its whiteboards filling with architecture diagrams and integration plans.
The first week was discovery: interviews with stakeholders, documentation reviews, system access requests. The consultants asked questions, took notes, and maintained expressions of professional neutrality that Linda had learned to read as concern.
“Your data architecture is more complex than we anticipated,” the lead consultant told Marcus at the end of week one. “The legacy systems have integration patterns we don’t see often anymore.”
“Can you work with it?”
“We can work with anything, given time and resources.”
“How much time? How many resources?”
The consultant glanced at his colleagues—the kind of glance that happens when a project scope is about to expand.
“We’ll have a revised estimate by end of next week.”
Marcus nodded, the stress ball finally in his hand, squeezing rhythmically. “I appreciate your honesty.”
“That’s what you’re paying for.”
But honesty, Linda knew, came with a price of its own. The revised estimate would add months. The integration work would add costs. And the data quality issues that Priya had documented would emerge, one by one, like rocks appearing as the tide went out.
She thought about the demos she had watched. The clean data flowing through colorful dashboards. The confidence of presenters who had never seen Thornfield’s fifteen years of accumulated inconsistencies.
The contract was signed, the money was committed, and the timeline was already slipping. And somewhere in the foundation they were building, data gaps waited to become the cracks that would eventually show.
To be continued…
What happens next: Daniel Park, the engineer who made the chatbot work, receives an offer he can’t refuse: a 40% raise, pure AI work, no legacy maintenance. His departure triggers a talent drain crisis and forces Marcus to confront an uncomfortable truth—he trained someone valuable, and now someone else gets the value. Chapter 4 reveals how losing one key person can unravel months of progress.
Part 4 publishes January 21, 2026.
Why we wrote this
Scott Weiner is the AI Lead at NeuEon, Inc., where he helps organizations navigate the complexities of AI adoption and digital transformation. This story draws from patterns observed across dozens of enterprise AI initiatives.
Erwann Couesbot is the CEO of FlipThrough.ai, specializing in AI strategy for professional services. His conversations with technology leaders inspired many of the dynamics explored in this narrative.
Want to read the complete story?
Have your own AI transformation story? We’d love to hear it. Connect with Scott on LinkedIn or reach out to NeuEon at neueon.com/contact.
