The Professor of Outputmaxxing — Anjney Midha, AMP
Anjney Midha discusses AI compute waste, the importance of utilization metrics like node allocation and MFU, and AMP's vision for a compute grid that makes FLOPs flow like megawatts. He advocates for responsible infrastructure, community incentives, iterative scaling, and alignment between capital and execution to address AI's real bottleneck: system efficiency.
The AI scaling debate always focuses on the question of “how do we get more GPUs?” but the better question may be: how do we make the most of the ones we already have?
The fact that a frontier lab like xAI could be running at sub-10% MFU (Model FLOPs Utilization) is just a hint at what the real problem may be.
For context, older frontier-scale training runs were already much higher than 10%. GPT-3 was around 21% MFU. Gopher was around 32%. Megatron-Turing NLG was around 30%. PaLM reached around 46%. And our guest Anjney says best-in-class MFU today is closer to 60–70%.
PaLM: Scaling Language Modeling with Pathways
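As a rough illustration of how an MFU figure like the ones above is computed, here is a minimal Python sketch. It assumes the common approximation of about 6 FLOPs per parameter per token for dense transformer training (attention terms are ignored), and every number in the usage example (parameter count, throughput, chip count, peak FLOP/s per chip) is hypothetical rather than a figure from any lab mentioned here.

```python
def model_flops_utilization(params, tokens_per_second, num_chips, peak_flops_per_chip):
    """Return MFU: achieved model FLOP/s divided by the cluster's peak FLOP/s."""
    if num_chips <= 0 or peak_flops_per_chip <= 0:
        raise ValueError("num_chips and peak_flops_per_chip must be positive")
    # Assumption: ~6 FLOPs per parameter per token (forward plus backward pass).
    flops_per_token = 6 * params
    achieved_flops_per_second = flops_per_token * tokens_per_second
    peak_flops_per_second = num_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run: 70B parameters, 1M tokens/s, 1,024 chips at 1e15 peak FLOP/s each.
example_mfu = model_flops_utilization(
    params=70e9,
    tokens_per_second=1.0e6,
    num_chips=1024,
    peak_flops_per_chip=1e15,
)
print(f"MFU: {example_mfu:.1%}")
```

With these made-up inputs the run lands at roughly 41% MFU. The point of the metric is that it counts only the FLOPs the model mathematically requires, so stalls from networking, data loading, or restarts all show up as lost utilization.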
It’s not necessarily that xAI is uniquely incompetent (it’s clear they have talented folks), but rather that priorities may be flipped in the GPU arms race.
While GPU access is a bottleneck, simply increasing CapEx won’t automatically translate to better models as frontier AI is increasingly a systems problem: scheduling, utilization, networking, kernels, frameworks, data pipelines, parallelism, cluster reliability, and the thousand small decisions that determine whether your theoretical FLOPs become real training progress.
From building Discord’s developer platform and backing frontier AI companies like Anthropic, Mistral, Black Forest Labs, and Periodic Labs to now building AMP’s independent compute grid, Anjney Midha has spent years close to the real bottlenecks of AI scaling. In this episode, Anjney joins swyx at Periodic Labs to unpack why the AI race is not just about buying more GPUs, why 95% utilization would have been considered an outage at Google, and why the next era of AI infrastructure has to be more aligned, more efficient, and more responsible.
We go deep on AMP’s vision for a compute grid that makes FLOPs flow like megawatts, the difference between full-stack AI labs and horizontal pooling, why AI data centers need community buy-in, and how compute markets could evolve into something closer to an independent system operator. Anjney also explains why DeepMind’s unpublished research points to a market failure, why end-of-life prediction remains one of the most important AI applications he has thought about for fourteen years, and why “output maxing” may become a new discipline for frontier systems.
We also discuss Anthropic’s culture, why “luck favors the prepared mind” in coding models, how Claude cracked coding, why too much capital too early can make AI labs fragile, what Periodic Labs is trying to do with science and superconductors, why great researchers can become great CEOs, and why Silicon Valley is both deeply missionary and deeply mercenary.
We discuss:
Why 95% utilization was considered an outage at Google
Why AI infrastructure waste compounds at frontier-lab scale
Why “move fast and break things” does not work for AI data centers
How data center backlash, power grids, and community incentives shape AI scaling
AMP’s vision for making FLOPs flow like megawatts
Why compute needs an independent system operator
How interruptible demand and dynamic prioritization worked inside Google
Why DeepMind research hoarding creates negative externalities
AMP’s 1.2GW base-load ambition and the need for 6GW of spike capacity
Why end-of-life prediction could become one of AI’s most important healthcare applications
Frontier Systems, output maxing, and full-stack alignment
Why APIs and abstraction layers become lossy as organizations scale
Superconductors, standards, and the dream of lossless systems
SF Compute, open protocols, and the future of compute marketplaces
Why non-NVIDIA chips can still benefit from NVIDIA’s reference architecture
Trust boundaries and why chip startups need visibility into future model architectures
Why VCs often underestimate researchers as CEOs
Scientists as star athletes of the mind
Why great CEOs need to be confrontational up and down the stack
Why leading the frontier matters more than “winning”
How Anthropic cracked coding
Why culture is fragile, not a permanent moat
Why hardship was a feature, not a bug, for Anthropic
Why Anthropic’s P0 was coding from day one
Periodic Labs, physics as the constraint, and technical reality
Silicon Valley mercenaries, missionary teams, and what happens after a breakthrough
Anjney Midha
LinkedIn: https://www.linkedin.com/in/anjney
X: https://x.com/AnjneyMidha
AMP PBC
Website: https://amppublic.com/
X: https://x.com/amppublic
Timestamps
00:00:00 Introduction
00:00:09 Why AI Compute Is Being Wasted
00:03:17 Responsible Infrastructure and Data Center Backlash
00:06:07 AMP Grid: Making FLOPs Flow Like Megawatts
00:12:41 Foundry, Frontier Labs, and Research Hoarding
00:14:42 Gigawatt-Scale Compute and End-of-Life Prediction
00:24:08 Frontier Systems, Output Maxing, and Alignment
00:27:38 Compute Markets, SF Compute, and Non-NVIDIA Chips
00:32:57 Trust Boundaries, Co-Design, and Researcher CEOs
00:38:17 AI Coachella and First-Principles Thinking
00:42:43 Leading vs Winning in Frontier AI
00:45:54 How Anthropic Cracked Coding
00:48:25 Culture, Hardship, and Anthropic’s P0
00:54:03 Periodic Labs, Physics, and Silicon Valley Mercenaries
00:56:26 Rishi Valley, Singapore, and Money as a Measure
00:58:47 Closing Thoughts
Transcript
Introduction: Anjney Midha, AMP, and Compute Waste
Swyx [00:00:00]: We’re in Periodic Labs with Anjney Midha, CEO, founder of AMP. Welcome.
Compute Utilization: Node Allocation, MFU, and Alignment
Anjney [00:00:09]: Thanks for having me. At Google, there are usually two types of utilization that you’re measuring in these clusters, right? One is node allocation, and the other is MFU. Node utilization is usually what percentage of cards in the data center are just used, and if it’s not at 95%-
Swyx [00:00:29]: There is no excuse
Anjney [00:00:29]: There’s no excuse, right? I think at Google, which is where my co-founder, Seb, came from (he built the Borg, PBorg/GQM scheduler at Google), 95% was considered an outage, so 96% node utilization should be standard. And most single-tenant clusters are not running at that. So that’s one. And then for MFU, I would say the best in class today is somewhere between 60 and 70%. I think this is a leadership question, right? Fundamentally it’s an alignment question, which is: are the people who are funding the cluster and the people deploying the cluster actually aligned? And sometimes theoretically they are, but in practice there are so many people in the chain, the supply chain between the capital and whoever’s managing the cluster and then whoever’s measuring what the output is, so many degrees of separation. Have you ever heard the radian metaphor, which is that at the beginning of an arc, if you have two lines that are just off by a few degrees, that-
Swyx [00:01:33]: It spreads out
Anjney [00:01:34]: It spreads out at scale, right? And I think that’s what’s happening with a lot of cluster implementations and infrastructure at a lot of frontier labs and other teams: they initialize the plan, which is kind of like a North Star, with a team that wants to do good, but then they’re required to scale so fast instead of iteratively that the wastage just compounds really fast at scale. And so I think we know the answer, which is just do iterative bring-ups. If you spend time with people who’ve been in the semiconductor industry or the DSN industry for a long time, this is not new, and I don’t think AI should be an excuse. What is new? Okay, we have a lot of new capabilities, but that doesn’t mean just abandon common sense. Common sense should always be in fashion. AI scaling doesn’t change that. In fact, if anything, AI scaling should be putting a premium on the value of common sense and infrastructure because the margin of error now is so much lower and the costs of wastage are so much higher. And the cost of wastage, by the way, is not just economic. Obviously I’m an investor by background. Over the last few years now we’re running an AI infrastructure business called AMP. And I think that it’s okay to say this time is different on the capabilities front. We are genuinely getting capabilities of a kind we haven’t had before. That doesn’t give you an excuse to say this time is different for everything, especially infrastructure. So look, I love the hacker mindset and the hustler mindset. That’s great for the startup mindset, but you remember this moment where Zuck went from saying, “Move fast, break things” to, move-
Responsible Infrastructure and Data Center Backlash
Swyx [00:03:10]: Fast and stable infrastructure
Anjney [00:03:11]: Move fast with stable infrastructure. I think now we need to move fast with responsible infrastructure. People are going to ask where the impact is. In our class yesterday at Stanford, Scott Nolan, who’s the founder of General Matter, came by to speak about energy bottlenecks. And he had a phenomenal idea. He said, “If you look at the marginal unit economics of compute per hour, let’s call it $4 an hour. If you’re having to bring up a new data center in a new community, why not just say we’re going to charge $4.50 an hour, and that marginal increase, we just literally take that and give it to the local community as cash?” I can tell you as a customer of that compute, I would love that. I’d be happy to pay an additional 50 cents per hour at scale.
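To put the $4 versus $4.50 idea in concrete terms, here is a small Python sketch of the arithmetic. The two hourly prices come from the conversation; the cluster size and the assumption that every GPU is billed for every hour of the year are hypothetical, chosen only to show the order of magnitude. Prices are kept in integer cents to avoid floating-point rounding.

```python
def community_payout_cents(base_cents, surcharged_cents, gpu_hours):
    """Total surcharge collected for the local community, in cents."""
    if surcharged_cents < base_cents:
        raise ValueError("surcharged price must not be below the base price")
    return (surcharged_cents - base_cents) * gpu_hours


def premium_percent(base_cents, surcharged_cents):
    """Customer's price increase as a percentage of the base price."""
    return 100 * (surcharged_cents - base_cents) / base_cents


# Hypothetical site: 10,000 GPUs billed for all 8,760 hours of a year.
gpu_hours = 10_000 * 8_760
payout_cents = community_payout_cents(400, 450, gpu_hours)
print(f"Customer premium: {premium_percent(400, 450)}%")
print(f"Annual community payout: ${payout_cents // 100:,}")
```

Under these assumptions a 12.5% premium on the hourly price becomes about $43.8 million a year in direct payments, which is the kind of visible local benefit the proposal is counting on.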
Swyx [00:03:57]: Wow. Yeah.
Anjney [00:03:58]: Because if that means the public benefit is so clear to the communities that the data centers are coming up in, I’m going to feel like that compute is much more reliable. My understanding is that up to 20% of all data centers this year in the US are at risk.
Swyx [00:04:13]: Of community backlash?
Anjney [00:04:14]: Correct. Of not getting the community support they need to get brought up.
Swyx [00:04:19]: Wow. That’s a huge number.
Anjney [00:04:20]: Yeah. Now, I think we should dig into what that number is. I think it’s a little bit overstated. These things can get over-reported, but it-
Swyx [00:04:27]: They don’t just care about jobs. They care about all the other stuff around it, right? They care about power grid, they care about environments-
Anjney [00:04:33]: Power grid, permitting, and so on. And imagine if you said there’s a new AI deal: if we’re bringing up a data center in your community, we’re actually going to reduce the cost of your electricity bill. Okay, now we’re talking. Right? The community’s going, “Okay. Now this is a deal. I feel like a partner in this.” Right now that’s not happening. There will be audits, there will be investigations, and when the regulators come, I don’t know when it’s going to be, the folks who are moving fast and breaking things in the name of AI progress better be prepared. That’s certainly not how we’re procuring compute. We’re trying as much as we can to work with partners who have long-term track records, many of whom, by the way, are not AI providers. I think this whole idea of neoclouds being somehow this new category is a lot of marketing speak. There are really good, reliable, trusted data center providers in America who’ve been around 20-plus years. I love those folks. They know how to- Are they sponsoring happy hours at NeurIPS? No. Are they legibly listed in Build? No. Are they hanging out in situational awareness parties? No. But they’re adults. I trust them.
Swyx [00:05:44]: They can run LAN. They can run
[truncated for AI cost control]