Every macro investment cycle has a defining blind spot. In the early days of digital television, the market initially treated video as a simple sequence of static images. It failed to appreciate that moving from a single standard frame (~100 KB) to a continuous stream of 30 or 60 frames per second multiplies the data rate into the tens of megabits per second: 100 KB × 30 frames per second is roughly 24 Mbit/s before compression.
Today, the global investment community is making the exact same error with Artificial Intelligence.
The current market narrative is completely obsessed with Large Language Models (LLMs), text-based reasoning, and conversational chatbots. Wall Street looks at the billions being spent on data centers and asks, “How many text queries do we need to process to justify this CapEx?” They are looking through a narrow, one-dimensional lens. The next 10 to 20 years of AI infrastructure won’t be defined by text tokens. We are on the precipice of a structural, order-of-magnitude shift toward high-density video generation, spatial computing, real-world digital twins, and multi-physics engineering simulations. As these workloads scale and compound year-over-year, the traditional cloud infrastructure will face a reckoning. Here is the deeply bullish, first-principles architectural case for why the AI “picks and shovels” supercycle is just getting started.
1. The Physics Shift: Memory-Bound Text vs. Compute-Bound Video
To understand why datacenter demand could quadruple year-over-year, we have to look at how each workload actually uses a chip.
Text inference is fundamentally memory-bound. When you type a prompt into an LLM, the GPU spends a massive amount of time waiting for model weights to stream from high-bandwidth memory into its on-chip cache. Because the chip’s mathematical units sit idle during this wait, and because weights fetched once can serve many requests at the same time, data centers can “batch” dozens of different users’ text requests onto a single GPU.
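A rough roofline-style sketch makes the batching argument concrete. The hardware figures and the 70-billion-parameter model below are hypothetical placeholders, not vendor specifications, and the model is deliberately simplified: it assumes every weight is read once per decode step and that each request costs about 2 FLOPs per parameter, ignoring KV-cache and activation traffic.

```python
from typing import Tuple

# Hypothetical hardware figures, chosen only for illustration.
PEAK_FLOPS = 1.0e15      # peak math throughput, FLOP/s (assumption)
MEM_BANDWIDTH = 3.0e12   # HBM bandwidth, bytes/s (assumption)
BYTES_PER_PARAM = 2      # FP16 weights


def decode_step_times(n_params: float, batch_size: int) -> Tuple[float, float]:
    """Return (memory_time, compute_time) in seconds for one decode step.

    Simplified model: weights are read once per step regardless of batch
    size, and each request in the batch costs ~2 FLOPs per parameter.
    """
    memory_time = n_params * BYTES_PER_PARAM / MEM_BANDWIDTH
    compute_time = 2 * n_params * batch_size / PEAK_FLOPS
    return memory_time, compute_time


def is_memory_bound(n_params: float, batch_size: int) -> bool:
    """True when waiting on memory dominates the step time."""
    memory_time, compute_time = decode_step_times(n_params, batch_size)
    return memory_time > compute_time


if __name__ == "__main__":
    n_params = 70e9  # hypothetical model size
    for batch in (1, 32, 256, 1024):
        mem_t, cmp_t = decode_step_times(n_params, batch)
        label = "memory-bound" if mem_t > cmp_t else "compute-bound"
        print(f"batch={batch:5d}  memory={mem_t * 1e3:8.2f} ms  "
              f"compute={cmp_t * 1e3:8.2f} ms  {label}")
```

Under these assumptions the memory time is fixed while the compute time grows with batch size, so adding users is nearly free until the two cross. That crossover is why text serving can pack many requests onto one chip.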
Video is entirely different. Video is compute-bound.
When an AI model generates video (like OpenAI’s Sora or Meta’s Movie Gen), it isn’t processing words; it is managing millions of spatio-temporal visual patches across multiple dimensions.
- Generating just 10 seconds of high-fidelity video yields roughly 1.8 GB of raw pixel data (for example, 1080p at 30 frames per second with 24-bit color).
- The GPU’s mathematical units stay close to fully utilized for the duration of the render.
- Concurrency per GPU collapses toward one. You can rarely share that GPU; in fact, you often have to shard a single video generation job across an entire cluster of tightly interconnected GPUs.
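The raw-data figure in the list above can be checked with simple arithmetic. The sketch below assumes uncompressed 1080p frames at 30 frames per second with 3 bytes per pixel; the resolution, frame rate, and color depth are illustrative assumptions, since the exact output format is not specified here.

```python
def raw_video_bytes(width: int, height: int, fps: int, seconds: int,
                    bytes_per_pixel: int = 3) -> int:
    """Uncompressed size of a clip: pixels per frame x bytes x frame count."""
    return width * height * bytes_per_pixel * fps * seconds


if __name__ == "__main__":
    # Assumed format: 1920x1080, 30 fps, 24-bit color, 10 seconds.
    size = raw_video_bytes(1920, 1080, fps=30, seconds=10)
    print(f"{size / 1e9:.2f} GB of raw pixels")  # prints "1.87 GB of raw pixels"
```

Moving the same clip to a higher resolution or frame rate scales the result linearly in each factor, which is where the pressure on compute and storage comes from.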
If we assume that over the next decade just half of the global creator economy’s daily video uploads (to platforms such as YouTube or TikTok) use AI augmentation or generation, a back-of-envelope estimate points to a baseline requirement of hundreds of thousands of top-tier GPUs running around the clock. That represents hundreds of megawatts of continuous power just to act as the “printing press” for the next generation of media, before a single viewer hits play.
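The shape of that estimate can be sketched in a few lines. Every input below is a hypothetical placeholder rather than a measurement: the daily video count, the GPU time per video, and the power draw per GPU are assumptions chosen only to show how the arithmetic works, and the result moves linearly with each of them.

```python
import math

SECONDS_PER_DAY = 86_400


def gpus_required(videos_per_day: float, gpu_seconds_per_video: float) -> int:
    """GPUs needed to keep up if each one runs continuously, 24 hours a day."""
    return math.ceil(videos_per_day * gpu_seconds_per_video / SECONDS_PER_DAY)


def fleet_power_mw(gpu_count: int, watts_per_gpu: float) -> float:
    """Continuous power draw of the fleet in megawatts."""
    return gpu_count * watts_per_gpu / 1e6


if __name__ == "__main__":
    # Hypothetical inputs, not sourced figures.
    videos_per_day = 10_000_000      # AI-generated or AI-augmented uploads
    gpu_seconds_per_video = 3_600    # one GPU-hour of rendering per video
    watts_per_gpu = 1_000            # GPU plus its share of cooling and networking

    gpus = gpus_required(videos_per_day, gpu_seconds_per_video)
    print(f"GPUs needed: {gpus:,}")
    print(f"Continuous power: {fleet_power_mw(gpus, watts_per_gpu):.0f} MW")
```

Halving the assumed GPU time per video halves the fleet, so the conclusion depends heavily on how efficient video models become.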
2. Entering the Physical World: The Heavy Industrial “Simulation Factory”
While creative video drives immense data volume, moving AI into scientific discovery, 3D CAD engineering, and real-world modeling drives extreme computational density.
When an AI helps design an airplane wing, a consumer electronics chassis, or an autonomous vehicle world-model, the operational mandate shifts from visual plausibility to deterministic physical precision.
[ Traditional LLM Datacenter ]
Low Intra-Network Traffic -> Many Independent Users -> High Memory Bottleneck
[ AI Industrial Simulation Factory ]
Massive Inter-GPU Traffic -> High-Precision Math (FP64) -> Total Compute Saturation
The Precision Wall ($FP64$)
Text and video models thrive on low-precision math ($FP8$ or $FP16$) to maximize speed. If a video model makes a rounding error, a pixel shifts slightly, and the human eye ignores it.
But if an engineering AI makes a rounding error while calculating stress tensors on a bridge or a biomedical implant, structures can fail and lives can be lost. Engineering workloads of this kind typically require Double Precision ($FP64$) mathematics. At this level of accuracy, the theoretical throughput of standard AI chips, which are optimized for low-precision formats, drops drastically, so the same volume of work requires a much larger physical silicon footprint.
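A small experiment illustrates how quickly low precision degrades in long numerical accumulations, which are the core of most solvers. The sketch below emulates half precision with the standard library's `struct` module (format code `e` is IEEE 754 binary16) and adds a small step many times. The step size and iteration count are arbitrary choices for illustration, and this is a toy demonstration, not a model of any real engineering code.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float (FP64) to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def accumulate(step: float, n: int, use_fp16: bool) -> float:
    """Add `step` to a running total n times, optionally rounding to FP16."""
    if use_fp16:
        step = to_fp16(step)
    total = 0.0
    for _ in range(n):
        total = total + step
        if use_fp16:
            # Emulate storing the running total in a half-precision register.
            total = to_fp16(total)
    return total


fp64_sum = accumulate(0.001, 10_000, use_fp16=False)
fp16_sum = accumulate(0.001, 10_000, use_fp16=True)

if __name__ == "__main__":
    print(f"exact answer : 10.0")
    print(f"FP64 result  : {fp64_sum!r}")
    print(f"FP16 result  : {fp16_sum!r}")
```

In half precision the running total eventually becomes so large relative to the step that each addition rounds away to nothing, and the sum stalls well short of the true value. A pixel can tolerate that; a stress calculation cannot.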
The Dual-Engine Loop
Furthermore, an engineering AI cannot operate in a vacuum. It triggers a continuous computational loop: the AI generates thousands of geometric permutations (represented as dense, complex boundary graphs), which must immediately be fed into traditional multi-physics solvers (Finite Element Analysis and Computational Fluid Dynamics) to verify structural integrity.
This requires an entirely new class of “Industrial Simulation Factories”: datacenters with massive internal network switching capacity, so that thousands of computing nodes can synchronize with minimal latency and without data starvation.
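The generate-then-verify loop described above can be sketched with toy stand-ins. Here a random sampler plays the role of the generative model and a closed-form beam formula plays the role of the multi-physics solver. Both are hypothetical simplifications: real pipelines would call an FEA or CFD engine, and the load, length, and allowable stress below are illustrative values only.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BeamDesign:
    """A rectangular cantilever cross-section (toy design candidate)."""
    width_m: float
    height_m: float

    @property
    def area_m2(self) -> float:
        return self.width_m * self.height_m


def generate_candidates(n: int, seed: int = 0) -> List[BeamDesign]:
    """Stand-in for the generative model: sample random cross-sections."""
    rng = random.Random(seed)
    return [BeamDesign(rng.uniform(0.02, 0.20), rng.uniform(0.02, 0.20))
            for _ in range(n)]


def max_bending_stress_pa(design: BeamDesign, tip_load_n: float,
                          length_m: float) -> float:
    """Stand-in for the solver: sigma = 6 F L / (b h^2) for a tip-loaded
    rectangular cantilever."""
    return 6.0 * tip_load_n * length_m / (design.width_m * design.height_m ** 2)


def verify(design: BeamDesign, tip_load_n: float, length_m: float,
           allowable_stress_pa: float) -> bool:
    return max_bending_stress_pa(design, tip_load_n, length_m) <= allowable_stress_pa


def design_loop(n: int, tip_load_n: float, length_m: float,
                allowable_stress_pa: float) -> Optional[BeamDesign]:
    """Generate n candidates, keep those that pass, return the lightest."""
    passing = [d for d in generate_candidates(n)
               if verify(d, tip_load_n, length_m, allowable_stress_pa)]
    return min(passing, key=lambda d: d.area_m2) if passing else None


if __name__ == "__main__":
    best = design_loop(1000, tip_load_n=5_000, length_m=2.0,
                       allowable_stress_pa=100e6)
    print(best)
```

Even in this toy version the verification step runs once per candidate, so the solver's cost scales directly with how many designs the generator proposes.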
3. The Ultimate Edge: The Case for Space-Based Datacenters
The sheer weight of these workloads may eventually push computing beyond the limits of Earth’s power grids. While space-based datacenters would be useless for low-latency text chats, they hold a significant structural advantage for the world’s hardest asynchronous workloads.
- Orbital Edge Ingestion: Earth observation satellites capture terabytes of raw hyperspectral and LiDAR data every hour but are choked by narrow downlink pipelines to ground stations. Processing this data in orbit with specialized spatial AI lets systems strip out noise and cut the volume that must be downlinked by as much as 100x.
- The Energy Wall Solution: Terrestrial data centers face immense public and regulatory backlash over land, water, and power grid strain. In a suitably chosen orbit, data centers can harvest unattenuated, near-continuous sunlight that is roughly 30 to 40% more intense than peak sunlight at Earth’s surface, providing abundant power for multi-week molecular and material science simulations.
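The intensity comparison in the last bullet follows from two reference values: the solar constant above the atmosphere (about 1361 W/m²) and the conventional clear-sky peak at the surface (about 1000 W/m²). The sketch below works through the ratio and the panel area for a given load. The 30% panel efficiency and the 1 MW load are hypothetical inputs, and the comparison ignores night, weather, and sun angle on the ground, which widen the gap further.

```python
SOLAR_CONSTANT_W_M2 = 1361.0   # sunlight above the atmosphere at Earth's distance
SURFACE_PEAK_W_M2 = 1000.0     # conventional clear-sky peak at the surface


def orbital_intensity_gain() -> float:
    """Fractional increase of in-orbit sunlight over surface peak sunlight."""
    return SOLAR_CONSTANT_W_M2 / SURFACE_PEAK_W_M2 - 1.0


def panel_area_m2(power_w: float, irradiance_w_m2: float,
                  efficiency: float) -> float:
    """Panel area needed to deliver power_w while fully illuminated."""
    return power_w / (irradiance_w_m2 * efficiency)


if __name__ == "__main__":
    load_w = 1e6        # hypothetical 1 MW compute load
    efficiency = 0.30   # hypothetical panel efficiency

    print(f"Intensity gain in orbit: {orbital_intensity_gain():.0%}")
    print(f"Area in orbit     : "
          f"{panel_area_m2(load_w, SOLAR_CONSTANT_W_M2, efficiency):,.0f} m^2")
    print(f"Area at surface   : "
          f"{panel_area_m2(load_w, SURFACE_PEAK_W_M2, efficiency):,.0f} m^2")
```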
4. The Picks and Shovels: Where the Alpha Generates
Because the market is treating AI as a monolithic block of conversational software, the structural enablers of this multi-dimensional transition are deeply mispriced. As workloads double and quadruple year-over-year, the real value accumulates at the physical layers:
The Write-Heavy Storage Fabrics
Text is read-heavy; spatial digital twins and video generation are intensely write-heavy. Logging parametric CAD histories, multi-layered physics simulation states, and massive video patches requires specialized, parallelized NVMe storage architectures capable of millions of IOPS. Companies that control high-performance enterprise data storage fabrics are sitting on an unappreciated goldmine.
The Physics Verification Moats
An AI cannot invent the physical world without a ground-truth validator. The legacy software monopolies that own the proprietary multi-physics simulation engines (FEA/CFD) hold a structural bottleneck. Every generative engineering tool must pay a mathematical licensing tax to these engines to verify its designs.
High-Precision and Interconnect Silicon
As clusters are forced to act as single, synchronous machines to process massive geometric graphs, the bottleneck shifts from chip speed to interconnect bandwidth and high-precision processing. Ultra-low latency network switches, electro-optical interconnects, and high-bandwidth memory (HBM) suppliers will experience structural tailwinds that standard software-as-a-service providers cannot duplicate.
The Next 20 Years
We are moving away from the era of centralized “chatbots” and entering the era of generative physical reality. The total datacenter capacity required to act as the design, simulation, and rendering engine for the physical and visual world is staggering.
The infrastructure boom isn’t nearing its end; it is merely graduating from words to worlds. The picks, shovels, chips, and storage architectures enabling this multi-dimensional regime shift represent the defining investment opportunity of the next two decades.
