Industry forecasts are sounding the alarm. Hundreds of billions of dollars will need to be invested in new data centers in the coming years to keep pace with AI’s insatiable computational needs.
Bain & Company projects that annual spend will reach $500 billion by 2030. Projects like the SoftBank-, OpenAI- and Oracle-led Stargate initiative are being set up to meet this demand.
Zuzanna Stamirowska, CEO and co-founder of Pathway.
More compute. More layers. More data. That is the logic we see and hear today, and it sets us on an unsustainable trajectory. The challenge is no longer tomorrow's concern: an AI energy crunch is a near-term inevitability if we stay locked into the path set by the prevailing wisdom of ever-upwards scale.
Being wedded to that trajectory also ignores a hard truth: unsustainable demand is hardwired into the unit economics of today's LLMs, and more specifically into the transformer architecture, which has high energy consumption baked into both training and inference.
The emergence of reasoning models, which produce more tokens of hidden “thinking” text before giving answers, adds to the problem. A 2025 study found that while a long prompt to GPT-4o consumes 0.42 Wh, a long prompt to reasoning models can consume as much as 33.634 Wh (DeepSeek-R1) and 30.495 Wh (GPT-4.5). That's enough energy to charge a smartphone, and then some, per prompt.
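For context, a typical smartphone battery holds roughly 12 to 19 Wh (around 3,000 to 5,000 mAh at 3.85 V), so a single 33 Wh reasoning prompt could, in principle, recharge one from empty with room to spare.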
Relying on spiraling token usage and inference compute as the only means of improving models puts a massive question mark over the current economic viability of the market in AI tools. We urgently need a paradigm shift to course-correct. That shift is already underway, and it begins with leaving the transformer behind. Welcome to the post-transformer era.
Consequences of chain of thought reasoning
Increased adoption of LLMs alone doesn't explain the runaway compute and power demands of AI companies. It's also a result of LLMs consuming more and more tokens over time.
The first generation of LLMs generated a few hundred tokens per response. In comparison, today’s ‘reasoning’ models burn through thousands of thinking tokens to map out step-by-step logic in chain-of-thought prompting.
But reasoning was bolted onto a core architecture that has remained untouched since the beginning: the transformer. Transformer-based models are incredibly resource-intensive.
They use massive amounts of compute because attention compares every token with every other token, so the work grows quickly as inputs get longer, and they depend on very fast memory, since a model's full set of parameters has to be read for every token it generates.
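To make that concrete, here is a minimal sketch of the attention step at the heart of the transformer, written in plain NumPy purely for illustration; it is not Pathway's code or any lab's production kernel, and the sizes are made up. Every token is scored against every other token, which is why compute and memory balloon as inputs get longer.

```python
# Illustrative sketch of scaled dot-product attention (not any vendor's kernel).
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (n_tokens, d) matrices of query, key and value vectors.
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # shape (n, n): every token vs. every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # weighted mix of value vectors

n, d = 4096, 64                                      # made-up sizes for illustration
Q = K = V = np.random.randn(n, d).astype(np.float32)
out = attention(Q, K, V)
print(out.shape)                                     # (4096, 64); the hidden cost is the 4096 x 4096 score matrix
```

The 4096-by-4096 score matrix is the quadratic term: double the input length and the number of comparisons quadruples.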
Naturally, sticking to the transformer without meaningful efforts to introduce mechanisms for efficient reasoning has pushed the compute and energy requirements of LLMs dramatically upwards. Has that trade-off resulted in far more intelligent models?
The growing consensus is no. All the signs point to today's transformer-based models hitting a functional ceiling, with diminishing returns despite ever-growing model scale. GPT-5, which underwhelmed at launch even though it took over two years to train, is the prime example.
Recent launches from today's leading labs have generated excitement, but none has brought about the architectural shift the industry needs or a true step change in functionality. Outside of the scaling race, true alternatives are gathering steam.
Taking cues from biological systems
The answer lies in nature's most elegant design: the human brain, a shining example of a natural system that achieves advanced cognitive feats with remarkable energy efficiency. Inspired by this, a new wave of challenger AI companies is replicating how the brain processes information in new architectures for LLMs, aiming to close the efficiency and learning gap that separates today's AI from the brain's computational model.
The brain operates through under 100 billion neurons acting as computational units, linked by hundreds of trillions of synapses that store and transform information. The number of synaptic connections into each neuron varies: some neurons act as central hubs with countless connections, while others have far fewer.
This dynamic holds even as the brain grows and changes, making the brain the quintessential scale-free network. Remarkably, the brain runs this entire network on roughly 20 watts, about as much as a dim lightbulb and a drop in the ocean compared to the colossal energy demands of today's leading LLMs.
In 2025, we launched the Dragon Hatchling (BDH) architecture to close the gap between AI models and the brain by replicating neuron-synapse dynamics in LLMs. Where transformer-based models draw on static, pre-trained weights (the same fixed knowledge base every time) to predict outputs, BDH works differently.
Only the artificial neurons relevant to the task at hand are activated, and their connections strengthen or weaken based on experience garnered from an enterprise deployment. This is Hebbian learning in practice: 'neurons that fire together, wire together', combined with synaptic plasticity, just like in our own brains. The result is a model that builds intelligence through use, not just through training, via local interactions.
It has a very compact and efficient network-like structure, which keeps knowledge exactly in the spot where it’s being processed, and enables continuous learning without the regular retraining cycles that transformer-based models depend on.
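To illustrate the principle rather than the product, here is a toy Hebbian update in Python. It is a generic textbook-style sketch under simplified assumptions of my own (random sparse activity, a plain outer-product rule and uniform weight decay), not BDH's actual mechanism.

```python
# Toy illustration of "neurons that fire together, wire together" (not BDH's code).
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 8
weights = np.zeros((n_neurons, n_neurons))          # synapse strengths between neurons
learning_rate, decay = 0.1, 0.01

for step in range(100):
    # Sparse activity: only a few neurons "fire" on any given input.
    activity = (rng.random(n_neurons) < 0.3).astype(float)
    # Hebbian update: strengthen connections between co-active neurons...
    weights += learning_rate * np.outer(activity, activity)
    # ...while unused connections slowly decay (a crude stand-in for plasticity).
    weights *= (1 - decay)

np.fill_diagonal(weights, 0)                        # ignore self-connections
print(np.round(weights, 2))                         # pairs that often fired together end up with stronger links
```

The point of the toy is locality: each synapse changes based only on the activity of the two neurons it connects, with no global retraining pass.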
This is a much-needed alternative approach to improving LLM reasoning without burning through tokens. Enterprises can do far more with LLMs before hitting token limits, because the entire model doesn't have to run every time it's put to use: only the neuron connections relevant to the task at hand do.
And AI companies finally have a solution that both improves model reasoning capabilities and breaks from the transformer's unsustainable spiral of energy consumption. Crucially, BDH is compatible with the general-purpose hardware that has made the LLM boom possible.
Rethinking architecture is the only way to tackle root causes
An AI energy crunch is a bleak scenario, but it's not an inevitable one. Rather, it's a scenario based on architectural choices that can be undone. The transformer changed the world. Reasoning capabilities added to LLMs in recent years have succeeded in improving model functionality.
But the underlying unit economics aren't viable. Nature already offers a solution to the problem of efficient, adaptive intelligence, and the ethos of the post-transformer era lies in taking the right lessons from it. We can build AI systems that get smarter through use, in a sustainable way that doesn't depend on infinite upwards scale.
For enterprises, this isn't just an environmental story; it's an economic one. We're looking at a 10x or greater reduction in inference costs. AI that reasons efficiently, learns continuously, and requires a fraction of the compute isn't a future aspiration. It's what post-transformer architecture delivers today.