The case for democratising compute in a multi-model world
AI’s next battleground is infrastructure, not models.

It’s January 2025, and a Chinese research lab releases a model that appears to punch far above its weight.

DeepSeek, trained on what was believed to be constrained hardware (Nvidia’s H800s), posted evaluation benchmarks that rivalled some of the most advanced proprietary models in the world. The immediate reaction was predictable and dramatic: if cutting-edge AI could be built on less compute, had the market mispriced the value of GPUs? Had Nvidia’s growth curve finally met its ceiling?

“Absolutely not, in my opinion,” says Tim Davis.

Tim’s seen this movie before. Not as an observer, but from inside the machine rooms where modern AI had quietly been assembled years earlier. Over 7 years at Google, he and his now co-founder worked alongside the teams that built the company’s vertically integrated AI stack: custom silicon, custom software and some of the earliest large-scale deployments of inference at internet scale. Now, as Co-Founder and President of Modular, an AI infrastructure company that has raised $380m at a $1.6bn valuation, he’s building the software layer he believes the entire industry will need. 

What the DeepSeek moment revealed, he argues, wasn’t a breakthrough in efficiency, but a widespread misunderstanding of where AI actually scales. Once more details emerged, it became clear the model had been trained using massive amounts of compute. The real shift wasn’t in training efficiency; it was in how people were thinking about the economics. 

While training large models grabs headlines, it’s largely a one-off cost in the tens or hundreds of millions. Inference, by contrast, is a recurring expense that compounds at scale, quickly reaching billions and shaping AI’s long-term cost curve.
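
To make that asymmetry concrete, here is a rough back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration, not a number cited in the interview.

```python
# Back-of-the-envelope comparison of a one-off training bill with recurring
# inference spend. All numbers are illustrative assumptions.

TRAINING_COST = 100e6     # one-off frontier training run: $100m (assumption)
COST_PER_QUERY = 0.005    # blended serving cost per query: half a cent (assumption)
DAILY_QUERIES = 1e9       # a consumer-scale product: one billion queries a day (assumption)

daily_inference_spend = DAILY_QUERIES * COST_PER_QUERY          # $5m per day
days_to_match_training = TRAINING_COST / daily_inference_spend  # 20 days
annual_inference_spend = daily_inference_spend * 365            # ~$1.8bn per year

print(f"Inference spend per day:         ${daily_inference_spend:,.0f}")
print(f"Days to match the training bill: {days_to_match_training:,.0f}")
print(f"Inference spend per year:        ${annual_inference_spend:,.0f}")
```

Under these assumptions, serving overtakes the entire training bill in about three weeks and crosses into billions within a year, which is why inference, not training, shapes the long-term cost curve.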

This conversation was recorded in December 2025.

From model supremacy to model-workload fit

In recent years, AI discourse has been framed as a race toward a single dominant model: the smartest system, trained on the most data, backed by the most compute. 

But Tim disagrees. 

“I think the idea that one single lab is going to dominate all AI isn’t true at all,” says Tim. 

Instead, what he sees emerging isn’t model supremacy but model-workload fit, as each use case—code generation, customer support, search, internal tooling and more—demands a different balance of intelligence, latency and cost. 

“I don’t think there’s going to be one specific thing that’s going to unlock or drive people to any one singular service,” says Tim. “Anthropic is strong in code. OpenAI has consumer momentum. Google has search and a vertically integrated stack.” 

Bret Taylor and the team at Sierra describe this as a “constellation of models”: not one system, but 15+ different models optimised for specific workloads. OpenRouter’s State of AI report confirmed this in practice: usage spread across different models depending on the use case, with no single provider dominating all workloads. The constellation matters more than the brightest star.

Google’s long bet on inference economics 

To understand why this fragmentation is inevitable, Tim points back to decisions Google made nearly a decade ago, long before ChatGPT became a household name. 

Back in 2015, Jeff Dean, one of the world’s most influential computer scientists and the leader of the Google Brain project (which later merged with DeepMind), predicted that compounding demand for compute would become a limiting factor for Google. Google could keep purchasing Nvidia GPUs, or it could build its own silicon.

“Google has what is colloquially known in Silicon Valley as NIH—Not Invented Here. They have a huge desire to own everything,” says Tim.

To Jeff’s credit, that forethought fundamentally changed the economics for Google.

When Google designed Tensor Processing Units (TPUs), they weren’t built for widespread commercial sale. They were designed for cost-per-query, arguably one of the most important metrics in a consumer business serving billions of users. By driving that cost down, Google dramatically increased the leverage of its ad-driven business model. 
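
As a minimal illustration of how that metric is derived, and why per-chip efficiency carries so much leverage at that scale, here is a short sketch. The figures are assumptions for the example, not Google’s actual TPU economics.

```python
# Minimal sketch of the cost-per-query metric. All numbers are assumptions
# for illustration, not Google's (or anyone else's) real figures.

def cost_per_query(hourly_hardware_cost: float, queries_per_second: float) -> float:
    """Amortised serving cost divided by the queries that hardware handles in an hour."""
    return hourly_hardware_cost / (queries_per_second * 3600)

baseline = cost_per_query(hourly_hardware_cost=20.0, queries_per_second=1.0)  # ~$0.0056/query
improved = cost_per_query(hourly_hardware_cost=20.0, queries_per_second=2.0)  # ~$0.0028/query

DAILY_QUERIES = 1e9  # consumer scale (assumption)
print(f"Baseline daily spend: ${baseline * DAILY_QUERIES:,.0f}")  # ~$5.6m
print(f"Improved daily spend: ${improved * DAILY_QUERIES:,.0f}")  # ~$2.8m
# Doubling throughput per unit of hardware halves the bill; at billions of
# queries a day, that is the leverage custom silicon is built to capture.
```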

The payoff wasn't immediate.

In 2017, Google confronted a shift that much of the market would only recognise years later: the training-inference flip. Training scales with the size of your research team, which may range from dozens to hundreds of people. Inference scales with the size of your user base: billions.

When Google deployed BERT and saw relevancy improvements that materially lifted revenue, they realised they needed to scale inference by a huge margin. The compute economics they’d optimised for training suddenly looked trivial compared to what serving these models would require. 

“If you jump forward to today, Google now has a huge advantage. Everyone is looking at Google how they’ve looked at Apple, with its own completely vertically integrated stack, its own silicon, its own operating system and an amazing team,” says Tim.

“They took a bet on building TPUs and then a full software stack for TPUs that they could scale,” he adds. “What everyone’s anticipating is the cost not only of training but of inference for them [Google] will be significantly lower, so it’s going to be difficult for these frontier labs to compete.”

While Google’s dominance may have been questioned for a period, by December 2025 the company accounted for 20% of all gains in the S&P 500 for the year. 

But Google’s advantage is also its constraint: TPUs only work inside Google. And it was during this formative period that Tim saw an opportunity to build the software layer that could do for the rest of the market what Google had done for itself. 

Building the hypervisor for AI compute 

While working at Google, Tim met Chris Lattner, the engineer behind LLVM and Swift, and one of the most influential compiler architects of his generation. The years they worked together became, in effect, extended product research that shaped three core insights behind what they would later build: Modular.

Those insights were:

  1. There’s a fragmentation and consolidation paradox: Google, arguably the world’s most advanced company, relies on three infrastructure stacks (TPUs for research, GPUs and CPUs for general workloads, and highly optimised systems for edge devices). But externally, some 95% of data centres are locked into the Nvidia ecosystem. 
  2. The future of superintelligence spans from data centres to the edge. AI can’t reach its full potential living only in pre-trained text corpora; it has to operate in the real world. 
  3. Hardware monopolies create economic distortions that harm the ecosystem. When one company dominates silicon, prices skyrocket due to pure supply-and-demand dynamics. More hardware equals more competition, which is good for everyone.

So what is Modular?

“Modular is a unified compute platform,” says Tim. “Our goal is to become a hypervisor for compute.” 

“The secret is no one cares about the hardware. Most developers will say they have a latency, throughput, accuracy and cost target. Help me!” 

In practice, this means Modular’s platform (MAX) and programming language (Mojo) let AI teams write once and deploy anywhere: Nvidia, AMD, Intel or custom silicon. The company recently demonstrated this with benchmarks showing AMD’s MI355X outperforming Nvidia’s Blackwell GPUs when running on Modular’s stack, proof that hardware portability doesn’t require sacrificing performance. 
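
To ground the idea, here is a purely hypothetical sketch of what write-once, deploy-anywhere looks like from the developer’s seat. The names and interface below are invented for illustration and are not Modular’s actual MAX or Mojo APIs; the point is only that the model code stays fixed while the hardware target becomes configuration.

```python
# Purely hypothetical sketch of "write once, deploy anywhere". The interface
# below is invented for illustration; it is not Modular's actual MAX/Mojo API.
from dataclasses import dataclass


@dataclass
class DeploymentTarget:
    """Describes where a model should run; the model artifact itself never changes."""
    vendor: str        # e.g. "nvidia", "amd", "intel", or a custom accelerator
    device_count: int  # number of accelerators to shard across


def deploy(model_path: str, target: DeploymentTarget) -> str:
    """Pretend-deploys the same model artifact to whichever hardware the target names.

    In a unified compute layer, only this configuration changes between vendors;
    the device-specific kernels underneath are generated by the platform.
    """
    return f"serving {model_path} on {target.device_count}x {target.vendor} accelerators"


# The same model on two different hardware back ends; only the target changes:
print(deploy("llama-3-8b", DeploymentTarget(vendor="nvidia", device_count=8)))
print(deploy("llama-3-8b", DeploymentTarget(vendor="amd", device_count=8)))
```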

It’s a playbook that draws from decades of computing history. Just as Windows created an operating system that let hardware vendors plug in while developers built on top, and just as Android did the same for mobile, the same logic applies to AI infrastructure: whoever controls the software layer that makes heterogeneous hardware accessible wins the next era. 

The difference is the stakes. The AI infrastructure market is measured in hundreds of billions. Getting this layer right determines whether AI compute remains a monopoly or becomes a competitive market. 

Australia’s opportunity and its hesitation

The infrastructure gap Modular is addressing isn’t just a Silicon Valley problem. Tim sees it playing out in starker form here in his home country, Australia. 

For Tim, the shift from chasing ever-larger models to building infrastructure that unifies compute across an increasingly fragmented hardware landscape has implications that extend well beyond technology design. It raises a more uncomfortable question about where the next generation of AI companies will be built, and whether Australia intends to be one of those places. 

When he first left Australia in 2011, the local startup ecosystem was still finding its footing. Capital was scarce, ambition was often constrained by pragmatism and building globally significant technology companies like Atlassian from Australia felt like the exception, not the rule. 

“I had a conversation with an angel back then who said, ‘I’ll give you half a million bucks for 50% of your company’,” says Tim. 

Nowadays, that picture looks very different. There’s capital, talent and a global mindset from day one. On paper, many of the historical constraints have been removed. 

What remains, Tim argues, is the hesitation to deploy what he describes as “the envy of the world” in terms of capital: the $4.3 trillion sitting in superannuation funds. 

“I wish there was more capital allocation and a vision set by the government that sees the opportunity to be part of the AI race and they can provide the picks and the hammers for the rush of gold.” 

“Why is the government not declaring that we have a huge opportunity in this country?” he asks, listing advantages that seem almost too obvious: massive land availability, the potential to supercharge power infrastructure and a geographic position that could make Australia the fundamental APAC superpower for AI compute while neighbouring countries remain land- and power-constrained. 

The numbers tell a stark story. China brought 700-800 gigawatts of power online in the last two years; 56% of it is renewable. The United States added 100-120 gigawatts. Australia? Between 10 and 15 gigawatts. Yet the government’s national AI plan projects $100 billion in data centre infrastructure, which would require 60-70 gigawatts to realise. “The gap is enormous,” Tim says. “It’s not even remotely close.”

Ultimately, he believes Australia needs to be willing to throw its hat in the ring and realise the opportunity in front of us. 

“I’ll act as an antagonist towards encouraging them to do that.”

Whether Australia makes its move remains to be seen, but the infrastructure race Tim is describing won’t wait. The companies and countries that control the next layer of AI compute will shape how this technology develops for decades. 
