Nature is Laughing at the AI Build Out

I was catching up on Acquired episodes this week and listened to the spectacular “Google: The AI Company” episode that Ben and David put together. What the two of them have created with their podcast fascinates me for the inevitability of its success. I think Acquired proves that if you are smart and work your ass off at creating pure value, they will, in fact, come.

But what struck me in the episode is a quote that Ben shared from Greg Corrado of Google Brain, who said that in nature, the way things work is the most energy-efficient way they could work.

The transformer paper and all the work that preceded it have led us to a fairly effective way to emulate some of what the human brain does. And I think it’s clear that most of the value that LLMs are delivering, and will deliver, is a cognitive supplement or replacement for many of the functions of the human brain.

So we’re competing against nature. AGI is a declaration of intent to try to replace the human brain. We’ve made it clear that we’re coming after mother nature’s design.

We’ve got a ways to go. The human brain uses 20 watts of electricity. I have three RTX 5090s here in my basement, and one of them consumes 800 watts when it’s working hard, if you include the CPU and chassis power. And while the RTX 5090 is a Blackwell-powered beast with 32GB of VRAM, the models that it’s capable of running don’t even come close to competing with something like GPT 5.2, Claude Opus 4.5 or Gemini 3 Pro.

It would be unfair for me to ignore concurrency, throughput, 24-hour availability, specialization and other capabilities that differentiate these models from the capability of a single human consuming 20 watts. We need sleep, we suck at multitasking and refuse to admit it, and we tend to work slowly on hard problems. But we’re still doing a lot with our 20 watts.
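For a rough sense of the gap, here’s a back-of-the-envelope comparison using the numbers above; the figures are the post’s illustrative estimates, not benchmarks:

```python
# Back-of-the-envelope comparison using the figures above; illustrative,
# not measured benchmarks.
BRAIN_WATTS = 20    # rough estimate for the human brain
RIG_WATTS = 800     # one RTX 5090 under load, including CPU and chassis

print(f"Power ratio: {RIG_WATTS / BRAIN_WATTS:.0f}x")      # 40x

# Energy over a full day, in kilowatt-hours
print(f"Brain: {BRAIN_WATTS * 24 / 1000:.2f} kWh/day")     # 0.48 kWh
print(f"GPU rig: {RIG_WATTS * 24 / 1000:.1f} kWh/day")     # 19.2 kWh
```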

The IBM 7090 was a groundbreaking transistorized computer introduced in 1959. I think with LLMs and AI as a problem space, we’re in the IBM 7090 era. The 7090 occupied around 2,500 square feet of floor space.

“I think there is a world market for maybe five computers.”
Thomas J. Watson, IBM chairman, 1943

The cellphone you’re probably reading this on has 21 million times the processing power of the IBM 7090. Humans are widely considered to be bad at intuiting exponential growth. Today we think in terms of large foundational models from providers with market caps of hundreds of billions or trillions of dollars, who consume hardware created by a company worth trillions of dollars. Today we have three IBM 7090s in the AI space: Anthropic, OpenAI and Google. Thomas Watson might suggest we have space for two more.
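To make the exponential point concrete, here’s a quick sketch of what a 21-million-fold increase implies; the multiple and the time span are rough, so treat the output as an order-of-magnitude illustration:

```python
import math

# Rough illustration of compound growth. Assumes the "21 million times"
# figure and a 1959-to-2024 span; both are approximate.
GROWTH_FACTOR = 21_000_000
YEARS = 2024 - 1959

doublings = math.log2(GROWTH_FACTOR)      # ~24.3 doublings
print(f"{doublings:.1f} doublings over {YEARS} years, "
      f"one every {YEARS / doublings:.1f} years")
```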

It is clear to me that we’re in the IBM 7090 era of AI. Consider the anomaly that is NVIDIA. Geoffrey Hinton, Alex Krizhevsky and Ilya Sutskever, working out of Geoff’s lab at the University of Toronto in 2012, went out and bought two NVIDIA GeForce GTX 580 GPUs to try to win the ImageNet contest, and cut the top-5 error rate by roughly 10 percentage points relative to the runner-up through the parallelization that CUDA provides.

In 2012, gaming was overwhelmingly the biggest market for GPUs. Today NVIDIA’s data center sales are $51.2B for the latest quarter, versus $4.3B for gaming.

NVIDIA’s existence and success are an acknowledgement that we need to fundamentally reinvent the computers on our desks and in our hands. Ultimately VRAM and RAM will no longer be differentiated, and, like the Blackwell architecture, memory bus widths will be 512-bit or more, with massive memory bandwidth (roughly 1.8 TB/s, or about 14 Tbps, on the Blackwell RTX 5090 today), along with CUDA-style parallelization.

But the real reason I wrote this post is the so-called “AI bubble.”

I’ve actually been a bubble skeptic for some time now. At times I’ve considered that we may be underestimating the value that AI will deliver and that this will be a rare anti-bubble, where market participants profoundly regret underinvesting. But recently my view has been shifting.

What concerns me is the build-out we’re seeing in power and data centers. We are making potentially flawed assumptions about a few key things:

  1. GPUs will continue to be extremely power hungry, based on the power investments we’re seeing. Today a single DGX B200 node has 8× B200 GPUs and 1,440GB of VRAM, and consumes around 8 kW of power for the single node. In the absolute best and completely naive case, Anthropic and OpenAI are using a single DGX B200 node per model instance, but my guess is that they’re using several chassis with InfiniBand interconnect per hosted model instance (a rough sizing sketch follows this list).
  2. Model hosting on GPU will continue to be very expensive, based on NVIDIA’s market cap. Today a single DGX B200 chassis with 8× B200 GPUs costs around $500,000.
  3. Model hosting will continue to occupy more space than normal year-over-year cloud infrastructure growth would require. Today the WSJ is covering the risk that real estate investors are exposed to in the AI boom.
  4. The power and compute hungry model architectures that we are using today will be comparable to the power and compute needs of future AI model architectures.
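To put assumptions 1 and 2 together, here is a naive per-instance sizing sketch using the figures above; the nodes-per-instance count and electricity price are my assumptions, purely for illustration:

```python
# Naive per-instance sizing using the figures from the list above.
# The nodes-per-instance count and electricity price are assumptions.
NODE_POWER_KW = 8           # DGX B200 node, per the figures above
NODE_PRICE_USD = 500_000    # DGX B200 chassis, per the figures above
NODES_PER_INSTANCE = 4      # hypothetical multi-chassis model instance
USD_PER_KWH = 0.10          # hypothetical industrial electricity price

hardware = NODES_PER_INSTANCE * NODE_PRICE_USD
power_kw = NODES_PER_INSTANCE * NODE_POWER_KW
annual_power_usd = power_kw * 24 * 365 * USD_PER_KWH

print(f"Hardware: ${hardware:,}")                      # $2,000,000
print(f"Continuous draw: {power_kw} kW")                # 32 kW
print(f"Annual power cost: ${annual_power_usd:,.0f}")   # ~$28,000
```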

What I think will happen is this:

AI compute will move out of GPU and be part of every computer we build, whether desktop, handheld or data center. The GPU will become a historical curiosity.

Hosting models that rival human intelligence, capability and output on every computer will be the norm.

Cloud hosting will not go away, and cloud hosting of extraordinarily powerful models for specialized tasks and use cases will be a big business, but it will sit alongside local capabilities and the ubiquity of models and the hardware that supports them.

The power, space and cost of cloud AI hosting will plummet over the next three decades, with stair-step gains multiple times per year. The DeepSeek phenomenon, and how it terrified the markets, is a harbinger of what is to come. And we will see this in model runners, model architectures, use of existing hardware, and hardware innovation.
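To see how stair-step gains compound over that horizon, here’s a toy calculation; the per-step gain and cadence are assumptions, not forecasts:

```python
# Toy compounding model; the per-step gain and cadence are assumptions,
# not forecasts.
STEP_GAIN = 0.10        # each stair-step cuts cost per unit of capability by 10%
STEPS_PER_YEAR = 2
YEARS = 30

remaining = (1 - STEP_GAIN) ** (STEPS_PER_YEAR * YEARS)
print(f"Cost per unit of capability after {YEARS} years: {remaining:.4f}x today's")
# ~0.0018x, i.e. roughly 500x cheaper under these assumptions
```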

The CUDA monopoly on GPU programmability will end, and programmable, high-performance, low-power AI compute will become available across all computers. NVIDIA’s gross margin on data center GPUs today is approximately 75%, which is an absurd anomaly rooted in the weird coincidence that gaming graphics compute happens to be fairly well suited to parallelizing AI compute.

And finally, and perhaps the most profoundly terrifying prospect: model architectures will become more power- and compute-efficient, leading to sudden drops in hardware and power needs, amplified by the increasing compute and power efficiency of newer hardware.

So where does this leave us when evaluating the bubbliness of the so-called AI bubble today?

  • Data center real-estate may be overinvested.
  • Power for data centers may be overinvested.
  • The ongoing cost of AI compute, and the profits that accrue to NVIDIA long term, may be overestimated.
  • The cloud-hosted-models business will lose market share to on-device models.
  • The concentration of AI capability among foundational model providers won’t last as compute adapts and many models move on device, with the low-latency benefits that provides.

And finally: as we see stair-step improvements in AI algorithms, we will see precipitous drops in market capitalizations among hardware vendors, power providers, real-estate investors and foundational model providers.

If mother nature has it right, it’s possible to host something equivalent to human-level intelligence using only 20 watts of power, in a space equivalent to the inside of your skull.
