In 1900, Every Serious Manufacturer Had a Coal Strategy

In the spring of 1913, inside a brick cathedral on the edge of Detroit, William B. Mayo — Henry Ford's chief power engineer, a Massachusetts man who had cut his teeth on marine steam plants — walked the gantry of the new powerhouse and signed off the load tests. Twin turbines. Eight boilers. A forest of brass dials. Fifty-three thousand horsepower generated on site, piped out as alternating current through the newly redesigned Highland Park plant. It was the largest privately owned electrical station in America. The trade press called it a marvel. The management consultants of the day — there were such creatures — called it inevitable. Every serious manufacturer, they explained, needed an energy strategy before an electrification strategy. Mayo had spent three years specifying this beast. He believed, as every serious power engineer of his generation believed, that a factory worthy of the name generated its own current, and that renting electrons from somebody else's wire was a confession of weakness dressed up as accounting. He was, by all accounts, quietly pleased.

Thirty years later, nobody who built a factory did this. They plugged it into the grid and got on with the work. Mayo's turbines were scrap.

Here is the thesis, and I am going to put it in the third paragraph so you cannot miss it. The size of your data work must match the blast radius of what you let the AI do — and whether you can undo the damage when it gets it wrong. Blast radius is what the AI is allowed to touch. Reversibility is whether a wrong answer is an embarrassing paragraph or a regulator on the phone. Put those two axes together and you get four quadrants, and the work looks different in each. So: read-only retrieval, write-enabled agency, foundation-model training, and classical machine learning — four quadrants, four disciplines, four separate answers to the question the orthodoxy insists on answering only once. The industry is selling you one answer, priced as though every situation were the hardest of the four, and calling it "data strategy first." It is not a strategy. It is a prerequisite mis-specified at every quadrant by people whose revenue depends on the mis-specification.

I run into this argument every week. Here is why I no longer buy it.

AI Generated: William B. Mayo's 1913 Highland Park powerhouse - 53,000 horsepower generated on site, and scrap within thirty years

The prerequisite that was always a sales pitch

Start with a fact that ought to be more embarrassing to its promoters than it is. The phrase "AI-ready data" — deployed as if it were handed down on stone tablets — was coined by Gartner in 2024, not 2014. The doctrine that you must have a data strategy before you touch a large language model hardened as received wisdom after ChatGPT landed. It is younger than the wave it claims to precede. The orthodoxy was retrofitted to the moment it was meant to predict.

And who is selling it? The data-platform vendors whose product categories were looking orthogonal to the action, the consultancies whose billable ETL years were at risk, and the CDO profession itself. The canonical survey figure — 93% of Chief Data Officers telling pollsters that an effective data strategy is essential — is not a finding. It is a confession of self-interest, rendered in the third person.

Look at what the evidence actually says. MIT's NANDA initiative, looking at over three hundred enterprise generative AI initiatives in August 2025, found that 95% of pilots deliver zero measurable profit-and-loss impact. A brutal number, and one the vendors love to quote because it sounds like it proves their case. But read the next paragraph — the one they tend not to quote — where MIT names the root cause. It is not data. It is what they call the learning gap: most generative AI systems do not retain feedback, do not adapt to context, and do not improve over time. The successful initiatives bought capability from vendors and wired it into a workflow. Internally built systems succeeded a third of the time. Vendor-bought systems succeeded two thirds of the time. The bottleneck is showing up in the feedback layer, not the data layer — and that is a problem that cannot be solved by buying more of what the incumbents are selling.

Why the four quadrants are the right unit of analysis

Back to Detroit. Back to Mayo on his gantry. Hold the four-quadrant frame in your head while I walk you through the history, because the history is where the framework earns its authority.

The story of factory electrification is the cleanest historical analogue we have for what is happening now, and it has been sitting in plain sight since Paul David published "The Dynamo and the Computer" in May 1990. Alternating current was commercially available from Niagara Falls in 1896. American manufacturing productivity then did what productivity statistics do when a general-purpose technology arrives and nobody knows how to use it yet: absolutely nothing, for thirty years.

Why? Because the factories of 1900 had been built around the line shaft. A single prime mover spun a long iron axle running the length of the shop floor, and every tool in the building hung off it with leather belts. The architecture was the constraint. When you electrified by swapping the steam engine for a central electric motor (what Warren Devine called "group drive"), you changed the energy source but kept the architecture. You got the same factory, slightly cheaper to run, and none of the productivity gains — because none of the productivity was hiding in the energy source. It was hiding in the architecture.

Unit drive — giving every machine its own motor, freeing the layout from the tyranny of the shaft, letting materials flow in straight lines — is what made Highland Park work. And yet, in 1913, Ford still had Mayo build his own fifty-three-thousand-horsepower generating station, because the orthodoxy of the day said you needed to own your own power. By 1930 nobody built their own generating station. The grid did the job and did it better. The prerequisite had dissolved into a utility bill. The real transformation — the ten-fold productivity jump — came from the rearrangement, not the resource.

Draw the AI landscape on the same evolution axis and the picture sharpens. Foundation models are sliding visibly towards commodity — you rent them from a shrinking number of providers at prices that halve every eighteen months. Retrieval-augmented generation and vector search have just commoditised: vector embeddings are now a native column type in Postgres, Oracle, Snowflake and Databricks, which means the thing the vendors were calling a moat last year is a feature of the database this year. The semantic layer, by contrast, is still stubbornly custom-built and always will be, because it encodes the specific meaning of your business; ETL and the warehouse beneath it have been commodity for fifteen years. Four components, four evolution stages, four different answers — and the orthodoxy sells you the same three-year programme for all of them.

AI Generated: Group drive versus unit drive - the ten-fold productivity gain was hiding in the architecture, not in the energy source

The category error in 2026

Coal was consumed. Data is not consumed — it is referenced. The 2015-vintage definition of data strategy — master data management, enterprise data warehouse, governance-first, build-the-lake-then-stock-it-with-fish — is not the foundation of modern AI. It is orthogonal to it.

Wavestone's Data and AI Leadership Executive Survey reports that "data-driven culture" inside large enterprises jumped from 21% in 2023 to 43% in 2024, and the researchers were explicit that the cause of the jump was generative AI. The cause. For ten years the CDO profession had been trying to get executives to care about data and getting nowhere. Then ChatGPT arrived and within eighteen months the board was asking the questions the CDO had been begging them to ask for a decade. The AI pulled the data along behind it.

And the profession knows. IBM's 2025 CAIO Global Study found that 26% of organisations now have a Chief AI Officer, up from 11% two years earlier — 48% of the FTSE 100. Jamie Dimon told an investor call in 2024 what he had done inside JPMorgan: "We took AI, slash data, out of technology. It's too important." The CDO role is being quietly folded into the CAIO role, because the market has decided the thing you optimise for is the outcome, not the upstream dependency.

I am not arguing the CDOs were frauds. Many of them are excellent. The thoughtful ones have spent a decade doing necessary, unglamorous work inside a category the market had wrongly framed — and the invitation in this piece is for them to walk out of the old category and into the new one, because their craft is needed there and the title is not what matters.

The steelman, and why it doesn't bury the argument

I owe you a fair hearing of the other side. Here is the best version of it, and then where I think it holds and where it crumbles.

The strongest objection comes from agentic compounding. The maths is brutal. A twenty-step agent running at 95% per-step reliability delivers a 36% end-to-end success rate. Drop each step to 85% — which is what you get when the context is noisy, schemas drift, and records are duplicated across three systems — and you are down to 4%. Bain's 2025 report makes exactly this point, and eight out of ten executives tell Bain that data limitations are the number-one blocker to scaling agentic AI. This is the single argument the orthodoxy gets right. No foundation model is clever enough to paper over a duplicate customer record spread across Salesforce and NetSuite with conflicting addresses. Notice the quadrant, though. It lives in the high-blast-radius, low-reversibility corner, and only there.
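The arithmetic is worth seeing on one line. Here is a minimal sketch, assuming each step succeeds or fails independently and a single failure sinks the whole run; the function name is mine, not Bain's.

```python
# Minimal sketch of the compounding argument above: if every step must
# succeed and steps fail independently, end-to-end success is p ** n.
def end_to_end_success(per_step_reliability: float, steps: int) -> float:
    """Probability that all `steps` succeed in sequence."""
    return per_step_reliability ** steps

print(f"{end_to_end_success(0.95, 20):.0%}")  # roughly 36% for 20 steps at 95% each
print(f"{end_to_end_success(0.85, 20):.0%}")  # roughly 4% once each step drops to 85%
```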

The second objection is the NYC MyCity chatbot. Built on Azure AI and grounded on two thousand curated dot-gov pages, it spent eighteen months cheerfully telling small business owners they could steal tips from their staff and refuse cash payments. State-of-the-art stack. Authoritative sources. And it still lied with confidence because nobody had modelled the entity relationships or tagged which regulations superseded which. RAG on chaos is still chaos. That is a read-only case where reversibility was badly under-estimated — small business owners taking legal advice from a chatbot and then getting sued is not damage you can undo with a product update.

The third objection is the EU AI Act, Article 10, binding from August 2026, which requires training and reference datasets to be "relevant, representative, free of errors and complete" for high-risk systems, with fines up to €15 million or 3% of global turnover. The Italian Garante has already fined OpenAI €15 million on lawful-basis grounds. If you operate inside any regulated sector, "no data strategy" now translates as "I will write one under enforcement pressure, next year, at four times the cost." That is regulation choosing the quadrant for you.

The fourth objection is the semantic layer, and this is the one I find most honest. A language model cannot tell you the Q3 margin on EMEA enterprise customers unless somebody has reconciled Q3, margin, enterprise customer, and EMEA across Salesforce, NetSuite, the warehouse, and the product database. BI was forgiving of missing semantics — an analyst would simply ask the human who knew. Language models are not forgiving. They will give you a confident, beautifully formatted, completely wrong answer. The uncomfortable version, put by people who are not trying to sell you a platform, is this: AI does not require a data strategy so much as it exposes the fact that you never really had one. The Open Semantic Interchange spec, pushed out by Snowflake, dbt, Salesforce and Mistral in January 2026, is the industry's belated admission that the semantic layer is what matters, not the warehouse beneath it.
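To make that concrete, here is a deliberately tiny, invented sketch of the kind of thing a semantic layer has to pin down before that question means anything; every definition below is hypothetical and would read differently in your business.

```python
# Hypothetical illustration only: three invented semantic-layer entries.
# None of these definitions comes from a real system; yours will differ.
SEMANTIC_LAYER = {
    "enterprise customer": "account with annual contract value of at least 100,000, taken from the CRM tier field, not the billing system",
    "emea": "billing country in the region list Finance maintains, not the sales territory recorded in the CRM",
    "q3 margin": "(recognised revenue - cost of delivery) / recognised revenue for the fiscal quarter, sourced from the warehouse revenue table",
}

def define(term: str) -> str:
    """Return the agreed business meaning of a term, or admit there is none."""
    return SEMANTIC_LAYER.get(term.lower(), "undefined: expect a confident, wrong answer")
```

The code is beside the point; the point is that every one of those strings is a business decision somebody has to make once, explicitly, rather than letting the model guess it differently on every query.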

AI Generated: A rusted line shaft and a modern GPU rack - every era has its prerequisite, sold by the people whose revenue depends on the wait

I concede the maths on agentic compounding. I concede NYC MyCity. I concede the EU AI Act. I concede that the semantic layer is irreducible. And yet none of that rebuilds the orthodoxy. Because the orthodoxy is not "match the data work to the blast radius and the reversibility of the damage." The orthodoxy is "build a three-year, eight-figure enterprise data programme before you are allowed to touch a language model." Each of the four objections picks out a specific quadrant where the data work is irreducible. None justifies pricing every situation as if it were the hardest quadrant. The 2026 version of a data strategy, sized per quadrant, is roughly a tenth the size of the 2019 version, and it runs in parallel with the AI work, not in front of it.

Who benefits from the prerequisite? And who owns the utility?

If the prerequisite has dissolved into a utility bill, who owns the utility? The answer, in 2026, is that three companies own the power stations — AWS, Azure and GCP — and a handful of foundation-model landlords rent you the turbines on top. Anthropic. OpenAI. Google DeepMind. The 1930 grid was contested, regulated, in many places publicly owned; cheap power was treated as a political question because everyone understood that whoever controlled it controlled industry. The 2026 equivalent is none of those things. It is a private, unregulated oligopoly of a handful of providers, one that the "rent, do not build" recommendation running through this piece quietly accelerates. I am not resolving that question here. I am not equipped to. But I refuse to pretend it is not sitting there, because "renting" and "being enclosed" may turn out to be synonyms on a long enough timeline, and the managerial argument does not get to duck the structural one underneath it. I would rather leave you holding the question honestly than palm you off with closure I have not earned.

Management is moral work. When a consultant or a vendor tells you that you must do X before you are permitted to do Y, the first question is not whether the claim is true. The first question is: who is paying the cost of the wait? Not the consultant. Not the vendor. The customers whose problems could have been solved in month three but will now be solved in year three. The colleagues who spend eighteen months in a governance committee re-litigating what a "customer" is across four systems, producing a unified model that will be obsolete before it ships. I have seen a hundred data strategy programmes in my career. Most were a polite way of saying "we are not ready to change anything yet, and this gives us eighteen months to avoid admitting it." When I ask the people pushing hardest for data-strategy-first whether they would personally bet their own money on a three-year programme delivering more value than a ninety-day vendor-led pilot wired to a narrow workflow, they go quiet. Every single time.

What I would do on Monday morning

Here is the test I would apply, if I were sitting where you are sitting, reading this on a Sunday evening because you have a board meeting on Tuesday.

Pick a single workflow where the outcome is measurable in pounds, dollars or euros within ninety days. Not strategic. Not transformational. Measurable. Name the quadrant it lives in — read-only, write-enabled agent, foundation-model training, or classical ML — and name, out loud, how reversible a wrong answer would be. Size the data work to match the quadrant and not a gram more. Build the evaluation harness before you build the system, because MIT is right about the learning gap and you are going to need it. Rent the capability. Do not build it. Wire it to the workflow. Measure. Learn. Move to the next workflow. Do this six times in a year, across the quadrants that actually matter to you, and then tell me whether you need a three-year data strategy to precede your AI strategy, or whether you have already accidentally built the only data strategy that matters.

I guarantee, with history on my side, the answer is the second one.

The factory owners who won the electrification transition in the 1920s did not win because they built the biggest powerhouses. They won because they rearranged their shop floors around what unit drive made possible. The ones who lost spent the 1900s building ever larger on-site dynamos, because they treated the energy source as the strategy and the architecture as an afterthought. The technology was not the bottleneck. The imagination was. Mayo, for what it is worth, retired a respected man. His turbines still got scrapped.

Technologies change. Human nature does not.

(Views in this article are my own.)