We've seen the valuations of a bunch of software companies crash, because people worry that AI is going to commoditize software.
And there's a potentially naive way of thinking about things, which is: look, NVIDIA sends a GDSII file to TSMC. TSMC builds those logic dies, it builds the switches, then it packages them with the HBM that SK Hynix and Micron and Samsung make, then it sends it all to an ODM in Taiwan where they assemble the racks. And so NVIDIA is fundamentally making software that other people manufacture, and if software gets commoditized, can NVIDIA keep monetizing?
Well, in the end, something has to transform electrons into tokens. That transformation, the transformation of electrons into tokens, and making those tokens more valuable over time...
“I think that's hard to completely commoditize.”
The transformation from electrons to tokens is such an incredible journey.
And making that token valuable, you know, is like making one molecule more valuable than another molecule. Think of the amount of artistry, engineering, science, and invention that goes into making that token valuable. Obviously, we're watching it happen in real time. And so the transformation, the manufacturing, all of the science that goes into it, it is far from deeply understood, and that journey is far from over.
And so I doubt that it will happen. We're going to make it more efficient, of course. In fact, the whole thing about NVIDIA, the way that you framed the question, is my mental model of our company. The input is electrons, the output is tokens, and in the middle is NVIDIA. Our job is to do as much as necessary and as little as possible
to enable that transformation to be done with incredible capability.
And what I mean by as little as possible: whatever I don't need to do, I partner with somebody and make it part of my ecosystem. And if you look at NVIDIA today, we probably have the largest ecosystem of partners, both upstream and downstream in the supply chain, all of the computer companies, all the application developers, all the model makers.
AI is a five-layer cake, if you will, and we have ecosystems across the entire five layers. And so we try to do as little as possible. But the part that we have to do, as it turns out, is insanely hard. And I don't think that gets commoditized.
In fact, I also don't think the enterprise software companies, the tool makers, get commoditized. Most of the software companies today are tool makers. Some of them are not; some of them are workflow-codification systems. But a lot of them are tool makers. For example, Excel is a tool, PowerPoint is a tool, Cadence makes tools, Synopsys makes tools.
I actually see the opposite of what people see.
“I think the number of agents is going to grow exponentially.”
The number of tool users is going to grow exponentially. And it's very likely that the number of instances of all these tools is going to skyrocket. It is very likely that the number of instances of Synopsys Design Compiler is going to skyrocket, and the number of agents that are going to be using the floorplanners and all of our layout tools
and design rule checkers. Today, the number of agents is limited by the number of engineers. Tomorrow those engineers are going to be supported by a bunch of agents.
And we're going to be exploring the design space like you've never seen it explored before, and they'll want to use the tools that we use today.
And so I think tool users are going to cause these software companies to skyrocket. The reason why it hasn't happened yet is because the agents aren't good enough at using these tools yet. And so either these companies are going to build the agents themselves, or agents are going to get good enough to be able to use those tools.
“And I think it's going to be a combination of both.”
I think in your latest filings, you had almost a hundred billion dollars in purchase commitments with foundries, memory suppliers, and packaging. And SemiAnalysis has reported that you will have 250 billion dollars of these kinds of purchase commitments. And so one interpretation is that Nvidia's moat is really that you've locked up many years of these scarce components. You know, somebody else might have an accelerator, but can they actually get the memory to build it? Can they actually get the logic to build it?
And this is really Nvidia's big moat for the next few years.
Well, it's one of the things that we can do that is hard for someone else to do.
The reason why we could: we've made enormous commitments upstream.
Some of it is explicit, these commitments that you mentioned. Some of it is implicit. For example, a lot of the investments upstream are made by our supply chain because I said to the CEOs: let me tell you how big this industry is going to be, and let me explain to you why, and let me reason through it with you, and let me show you what I see. And so as a result of that process of informing, inspiring, and aligning with CEOs of all different industries upstream,
They're willing to make the investments.
“Now, why are they willing to make the investments for me and not someone else?”
And the reason for that is because they know that I have the capacity to buy their supply and sell it through my downstream. Because Nvidia's downstream supply chain and our downstream demand are so large, they're willing to make the investment upstream. And so if you look at GTC, people marvel at the scale of GTC and the people who go. It's a 360-degree view, the entire universe of AI all in one place.
And they're all in one place because they need to see each other. I bring them together so that the downstream could see the upstream, and the upstream could see the downstream.
“And all of them could see all the advances in AI.”
And very importantly, they can all meet the AI natives and all the AI startups that are all being built and all the amazing things that are happening. So that they could see firsthand all the things that I tell them. And so I spend a lot of my time informing directly or indirectly our supply chain and our partners and our ecosystem about the opportunity that's in front of us.
You know, most of my keynotes, some of you will always say, you know, Jensen's keynotes are like one announcement after another announcement after another.
There's always a part of our keynotes that's a little torturous, in the sense that it almost comes across like an education. And in fact, that's exactly what's on my mind. I need to make sure that the entire supply chain, upstream and downstream, the ecosystem, understands what is coming at us, why it's coming, when it's coming, how big it's going to be, and is able to reason about it systematically, just like I reason about it.
“And so I think the moat, as you describe it: we're able to, of course, build for a future where, if our next several years is a trillion dollars in scale, we have the supply chain to do it.”
It's our reach, the velocity of our business. You know, just as there's cash flow, there's supply chain flow; there are turns. Nobody's going to build out a supply chain for an architecture if that architecture's business turns are low. And so our ability to sustain the scale is only because our downstream demand is so great, and they see it, they all hear about it, they see it all coming. And that allows us to do the things we're able to do at the scale we're able to do.
I do want to understand more concretely whether the upstream can keep up. For many years now, you guys have been 2x-ing revenue year over year; you've been more than tripling the amount of FLOPS you're providing to the world year over year.
And 2x-ing at this scale now, it's really... exactly. And then you look at logic, say: you're the biggest customer on TSMC's N3 node, and you're one of the biggest on N2, and AI as a whole is going to be 60% of N3 this year and 86% next year, according to SemiAnalysis. How do you 2x if you're already the majority? How do you do that year over year? So are we in a regime now where the growth rate in AI compute has to slow because of the upstream? Do you see a way to get around this? You know, how do we build 2x more fabs year over year, ultimately?
Yeah, at some level, the instantaneous demand is greater than the supply, upstream and downstream, in the world. And at any instant, in any instance, we could be limited by the number of plumbers, which actually happens. The plumbers are invited to next year's GTC. Yeah, you know, by the way, great idea. But that's a good condition.
You want a market, you want an industry, where the instantaneous demand is greater than the total supply of the industry.
The opposite is obviously less good.
If we're too far apart, if one particular item, one particular component, is too far away,
obviously the industry swarms it. So for example, notice people aren't talking very much about CoWoS anymore. Yeah. And the reason for that is because for two years, we swarmed the living daylights out of it.
“And we doubled and doubled and doubled, several doubles, and now I think we're in fairly good shape.”
And TSMC now knows that CoWoS supply has to keep up with the rest of the logic demand and the memory demand. And so they're scaling CoWoS, and they're scaling, you know, future packaging technologies at the same level as they scale logic, which is terrific. Because for a long time, CoWoS was rather specialty, and HBM memory was rather specialty. But they're not specialties anymore. People now realize they're mainstream computing technologies.
And then, of course, we're now much more able to influence a larger scope of our supply chain. In the past, you know, in the beginning of the AI revolution, all the things that I say now, I was saying five years ago. And some people believed in it and invested in it, for example, Sanjay and the Micron team. I still remember the meeting really well, where I was clear about exactly what was going to happen, and why it was going to happen, and made the predictions that are playing out today.
And they really doubled down on it, and we partnered with them across LPDDR, across, you know, HBM memories. They really invested in it, and it obviously has been tremendous for the company. Some people came a little bit later, but now they're all here.
“And so, I think each one of these generations, each one of these bottlenecks, gets a great deal of attention.”
And now we're pre-fetching the bottlenecks years in advance. So, for example, the investments that we've made with Lumentum and Coherent and all of the silicon photonics ecosystem over the last several years: we really reshaped the ecosystem and the supply chain for silicon photonics. We built up an entire supply chain around TSMC. We partnered with them on COUPE, invented a whole bunch of technology.
We licensed those patents to the supply chain to keep it nice and open. And so we're preparing the supply chain through invention of new technologies, new workflows, new testing equipment, double-sided probing, investing in companies, helping them scale up their capacity. And so you can see that we're trying to shape the ecosystem so that it's ready to support the scale.
It seems like some bottlenecks are easier than others, though: scaling up CoWoS versus scaling up...
I went to the hardest one, by the way. Which is? Plumbers. Yeah, I actually went to the hardest one. Plumbers and electricians. And the reason for that is because, and this is one of the concerns that I have about the doomers
describing the end of work and the killing of jobs: if we discourage people from being software engineers, we're going to run short of software engineers. The same prediction was made 10 years ago; some of the doomers were telling people, whatever you do, don't be a radiologist.
And some of those videos are still on the web.
You know, radiology, it's going to be the first career to go.
The world's not going to need any more radiologists. Guess what: we're short of radiologists.
Okay, so going back to the point about how some things you can scale and other things you can't: how do you actually manufacture 2x the amount of logic a year? Ultimately, that's the bottleneck. Memory and logic are the bottleneck, but you view the...
“How do you get 2x as many EUV machines a year, year over year?”
None of that is impossible to scale quickly. All of that is easy to do within two or three years. You just need a demand signal. Once you can build one, you can build 10, and once you can build 10, you can build a million.
And so these things are not hard to replicate.
How far down the supply chain do you go? Do you go to ASML and say, hey, if I look out three years from now, for Nvidia to be generating two trillion a year in revenue, we need way more EUV machines?
Some of them I have to convince directly, some of them indirectly. And some of them, if I can convince TSMC, then ASML will be convinced.
And so, you know, we have to think about the critical pinch points.
But if TSMC is convinced, you'll have plenty of EUV machines in a few years. My point is that none of the bottlenecks lasts longer than two to three years.
None of them.
In the case of Hopper to Blackwell, some 30 to 50x.
“We're coming up with new algorithms because CUDA is so flexible.”
We're developing all kinds of new techniques so that we drive efficiency in addition to increasing capacity. And so none of those things worry me. It's the stuff that's downstream from us: energy policies that prevent energy growth. You can't create an industry without energy. You can't create a whole new manufacturing industry without energy.
We want to re-industrialize the United States. We want to bring back chip manufacturing and computer manufacturing and packaging. And we want to build new things like EVs and robots, and we want to build AI factories. And you can't build any of these things without energy. And those things take a long time.
But more chip capacity, that's a two-to-three-year problem. More CoWoS capacity, a two-to-three-year problem. Interesting.
“I feel like I've had guests tell me the exact opposite thing.”
So in this case, I just don't have the technical knowledge to adjudicate. Well, the beautiful thing is you're talking to the expert. Yeah. That's true, true. Okay.
I want to ask about your competitors. Yeah. So if you look at TPUs: arguably two of the top three models in the world, Claude and Gemini, were trained on TPUs. What does that mean for Nvidia going forward? Well, we have a very different thing.
You know, what Nvidia built is accelerated computing. Not a tensor processing unit. And accelerated computing is used for all kinds of things. You know, molecular dynamics and quantum chromodynamics. And it's used for data processing.
Data frames, structured data, unstructured data. It's used for fluid dynamics, particle physics. And in addition, we use it for AI. And so accelerated computing is much more diverse. And although AI is the conversation today, and is obviously very important and impactful,
computing is much broader than that. And what Nvidia has done is reinvent the way computing is done, from general-purpose computing to accelerated computing. Our market reach is far greater than any TPU, any ASIC, can possibly have. And so if you look at our position, we're the only company that accelerates applications of all kinds, and we have a gigantic ecosystem.
And so all kinds of frameworks and algorithms run on Nvidia. And because our computers are designed to be operated by other people, anyone who's an operator can buy our systems.
“Most of these home-built systems, you have to be your own operator,”
because they were never designed to be flexible enough for other people to operate.
And so as a result of the fact that anybody can operate our systems, we're in every cloud, including Google Cloud, Amazon, Azure, and OCI. And so whether you want to operate it to rent it out or operate it for yourself: if you want to operate it to rent, you'd better have a large ecosystem of customers and many industries to be the off-takers.
If you want to operate it for yourself, we obviously have the ability to help you operate it yourself, like we do, for example, with xAI. And because we can enable operators in any company, in any industry, you could use it to build a supercomputer for scientific research and drug discovery at Lilly. We can help them operate their own supercomputer and use it for the entire diversity of drug discovery and biological science
that we accelerate. And so there's a whole bunch of applications that we can address that you can't with TPUs. Nvidia's platform is a fantastic tensor processing unit as well, but it handles the entire life cycle of data processing and computing and AI and so on and so forth. And so our market opportunity is just a lot larger. Our reach is a lot greater.
And because we basically support every application in the world now, you could build Nvidia systems anywhere and know that there will be customers for them. And so it's a very different thing.
This is going to be sort of a long question, but, you know, you have spectacular revenue.
And this revenue is mostly... you're not making 60 billion a quarter from pharma and quantum.
You're making it because AI is an unprecedented technology that is growing unprecedentedly.
And so then the question is, what is best for AI specifically?
And I'm not in the details, but I talk to my researcher friends, and they say: look, when I use a TPU, it's this big systolic array that's perfect for doing matrix multiplies, whereas a GPU is very flexible. It's great when you have lots of branching, when you have irregular memory access. But, you know, what is AI? It's just these very predictable matrix multiplies, again and again and again.
And you don't have to give up any die area for warp schedulers, for, you know, switching between threads, and memory banks. And so the TPU is really optimized for the majority, the bulk, of this growth in revenue and the use case for compute that is coming online right now. I wonder how you react to that.
Matrix multiplication is an important part of AI, but it's not the only part of AI.
“And if you want to come up with a new attention mechanism, or if you want to disaggregate in a different way,”
if you want to come up with a whole new type of architecture altogether, for example, you know, a hybrid SSM, or if you want to create a model that fuses diffusion and autoregression somehow, you want an architecture that's just generally programmable. And we run everything you can imagine, and that's the advantage. It allows for the invention of new algorithms a lot more easily.
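To make that concrete: even one step of a standard transformer mixes matrix multiplies with masking, reductions, and data-dependent routing. A minimal PyTorch sketch, purely illustrative; the function names and shapes here are ours, not anything from Nvidia or Google:

```python
import torch
import torch.nn.functional as F

def attention_step(q, k, v, mask):
    # The matmuls are the regular, systolic-array-friendly part...
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    # ...but the rest is masking, reductions, and exponentials:
    scores = scores.masked_fill(mask, float("-inf"))
    probs = F.softmax(scores, dim=-1)
    return probs @ v

def moe_route(x, gate_logits, experts):
    # Mixture-of-experts routing: data-dependent branching and
    # gather/scatter, i.e., irregular memory access per token.
    top = gate_logits.argmax(dim=-1)
    out = torch.zeros_like(x)
    for i, expert in enumerate(experts):
        sel = top == i
        if sel.any():
            out[sel] = expert(x[sel])  # sparse dispatch to one expert
    return out
```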
And so it's a programmable system, and the ability to invent new algorithms is really what makes AI advance so quickly. You know, TPUs, like anything else, are impacted by Moore's Law. And we know that Moore's Law is improving about 25% per year.
“And so the only way to really get 10x leaps, 100x leaps, is to fundamentally change the algorithm and how it's computed, every single year.”
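The arithmetic behind that claim, using only the numbers quoted in this conversation (25% per year from process, roughly 50x from Hopper to Blackwell over about two years):

```python
import math

# If process alone improves ~25% per year, a 10x gain takes about a decade.
rate = 1.25
years_to_10x = math.log(10) / math.log(rate)
print(f"Moore's Law alone: ~{years_to_10x:.1f} years per 10x")  # ~10.3 years

# A ~50x generational gain over the ~2 years between Hopper and Blackwell
# implies most of it came from algorithms and system co-design.
process_part = rate ** 2            # ~1.56x from two years of process
codesign_part = 50 / process_part   # ~32x from everything else
print(f"process: {process_part:.2f}x, algorithms + system: {codesign_part:.0f}x")
```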
And that's Nvidia's fundamental advantage. It's the only reason we were able to make Blackwell 50 times Hopper.
You know, I said it was 35 times. When I first announced Blackwell, I said it was going to be 35 times more energy efficient than Hopper.
Nobody believed it. And then Dylan wrote an article saying, in fact, I had sandbagged it: it's actually 50 times. And you can't reasonably do that with just Moore's Law. The way that we solve that problem is new models, MoEs, parallelized and disaggregated and distributed across a computing system. And without the ability to really get down and come up with new kernels with CUDA, it's really hard to do. And so it's the combination of the programmability of our architecture and the fact that Nvidia is an extreme co-design company, where we can even offload some of the computation into the fabric itself, into NVLink, for example, or into the network, Spectrum-X.
And we can effect change across the processors, the system, the fabric, the libraries, the algorithms, all of that simultaneously. Without CUDA to do that, I wouldn't even know where to start.
My sponsor Crusoe was among the first clouds to offer Nvidia's Blackwell and Blackwell Ultra platforms, and they just announced their Nvidia Vera Rubin deployment schedule for later this year. But access to state-of-the-art hardware is only part of the story. For example, most inference engines already do KV caching for a single user's forward passes, but Crusoe does it across users and GPUs.
So if a thousand agents are running on the same system prompt, Crusoe only has to compute the KV cache once for it to become available to every single GPU in the cluster. This is especially important as systems get more agentic and require much longer prefixes in order to use tools and access files.
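As a toy sketch of the general prefix-caching idea (our illustration, not Crusoe's actual implementation; compute_kv stands in for the model's prefill pass):

```python
import hashlib

kv_store = {}  # prefix hash -> cached KV tensors, shared across requests

def prefix_key(prompt_tokens):
    return hashlib.sha256(str(prompt_tokens).encode()).hexdigest()

def get_or_build_kv(prompt_tokens, compute_kv):
    key = prefix_key(prompt_tokens)
    if key not in kv_store:
        # Only the first request with this prefix pays for the prefill.
        kv_store[key] = compute_kv(prompt_tokens)
    return kv_store[key]

# A thousand agents sharing one system prompt: one prefill, many reuses.
```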
In a recent benchmark, Crusoe was able to deliver up to ten times faster time to first token, and up to five times better throughput, than vLLM.
“This is just one of the many reasons that you should run your inference workloads with Crusoe.”
And if you need GPUs for training, you don't need to switch clouds. Crusoe's got you covered there too. Go to crusoe.ai/dwarkesh to learn more.
So this gets at an interesting question about Nvidia's clientele, where 60% of revenue is coming from these big five hyperscalers. You know, in a different era with different customers, let's say it's professors running experiments, they're helped a bunch by CUDA; they need CUDA. They can't use another accelerator.
They just need to run PyTorch with CUDA and have everything optimized.
But if you've got these hyperscalers, they have the resources to write their own kernels.
In fact, they have to, to get that extra last 5% that they need for their specific architecture. And Google is mostly running their own accelerators, alongside the GPUs they run and train on. But even OpenAI, using GPUs, has Triton, where they said, we need our own kernels. So instead of using CUDA C++ and NCCL and everything, they've got their own stack, which compiles to some other accelerators as well.
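For a flavor of what writing below CUDA C++ looks like, here is a minimal vector-add kernel in Triton's Python DSL, essentially the canonical tutorial example rather than anything from OpenAI's production stack:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per block of 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The same Triton source targets Nvidia GPUs and, through other backends, some non-Nvidia accelerators, which is exactly the portability the question is pointing at.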
And so if most of your customers can, and do, make replacements for CUDA,
to what extent is CUDA really the thing that is going to make frontier AI happen on Nvidia?
“CUDA is a rich ecosystem. And so if you want to build on any computer, building on CUDA first is incredibly smart.”
And because the ecosystem is so rich, we support every framework if you want to create custom kernels. For example, we contribute enormously to Triton, and so the backend of Triton has huge amounts of Nvidia technology in it. We're delighted to help every framework become as great as it's going to be, and there are lots and lots of frameworks. There's Triton, there's vLLM, there's SGLang, and there's more, right? And now there's a whole bunch of new reinforcement learning frameworks coming out.
You've got TRL, you've got NeMo RL, you've got a whole bunch of new ones. And now, with post-training and reinforcement learning, that entire area is just exploding. And so if you want to build on an architecture, building on CUDA just makes sense. Because you know that the ecosystem is great. You know that if something breaks, it's more likely in your code and not in the mountain of code underneath. You know, don't forget the amount of code that you're dealing with when you're building these systems.
“When something doesn't work, was it you or was it the computer?”
You would like it always to be you, and to be able to trust the computer. And obviously we still have lots and lots of bugs ourselves. But our system is so well wrung out that you can at least build on top of the foundation. So that's number one: the richness of the ecosystem, the programmability of it, the capability of it.
The second thing is, if you were a developer and you were building anything at all, the single most important thing you want more than anything is install base.
You want the software that you write to run on a whole bunch of other computers. You're not building software just for yourself; you're building software for your fleet, or for everybody else's fleet, because you're a framework builder. And Nvidia's CUDA install base is ultimately its great treasure. We are now at, I don't know how many, several hundred million GPUs. Every cloud has it, going back to the A100, H100, H200, you know.
The L series, the P series, I mean, there's a whole bunch of them, and they're in all kinds of sizes and shapes. And if you're a robotics company, you want that CUDA stack to actually run on the GPU in the robot itself. We're literally everywhere. And so the install base means that once you develop the software, once you develop the model, it's going to be useful everywhere. The install base is just incredibly valuable. And then lastly, the fact that we're in every single cloud makes us genuinely unique, because you're an AI company, you're an AI developer.
You're not exactly sure which CSP you're going to partner with and where you would like to run, and we run everywhere, including on-prem if you like.
“And so I think the richness of the ecosystem, the expansiveness of the install base, and the versatility of where we are, that combination makes CUDA invaluable.”
That makes a lot of sense. I guess the thing I'm curious about is whether those advantages matter a lot to your main customers. There are many people they might matter for, but the kind of player who can actually build their own software stack will make up most of your revenue. Especially if we get to a world where AI is getting especially good at the things that have tight verification loops, and kernels are one of them. And then there's this question of how you write a kernel that does attention or an MLP most efficiently across a scale-up domain.
It's a very verifiable sort of feedback loop, and so all the hyperscalers can write these custom kernels for themselves. And Nvidia still has great price performance, so they might still prefer to use Nvidia. But then does it just become a question of who is offering the best specs, the best FLOPS and memory and memory bandwidth,
for a given dollar, where historically Nvidia has had, and still has, very high margins?
And the question is, can you sustain those margins
“if most of your customers can actually afford to build around the CUDA moat?”
The number of engineers we have assigned to these AI labs is insane, working with them, optimizing their stack. And the reason for that is because nobody knows our architecture better than we do. And these architectures are not as general-purpose as a CPU.
You know, a CPU is kind of like a Cadillac, you know. It's a nice cruiser.
And it never goes too fast. Everybody drives it pretty well; it's got cruise control, and everything is easy. But in a lot of ways, Nvidia's GPUs, our accelerators, are kind of like F1 racers. And yeah, I could imagine everybody's able to drive it at 100 miles an hour, but it takes quite a bit of expertise to be able to push it to the limit. And we use a ton of AI to create the kernels that we have.
And I'm pretty sure we're still going to be needed for quite some time. And so our expertise helps our AI lab partners get another 2x out of their stack, easily, oftentimes. It's not unusual that by the time we're done optimizing their stack, or optimizing a particular kernel, their model has sped up by 3x, 2x, 50%. That's a huge number, especially when you're talking about the install base of the fleet that they have, all the Hoppers and Blackwells.
When you increase it by a factor of two, that doubles the tokens, and that directly translates to revenues. Our computing stack has the best performance per TCO in the world, bar none. Nobody can demonstrate to me that any single platform in the world today has a better performance-to-TCO ratio, not one company. In fact, the benchmarks are out there, Dylan's, right? InferenceMAX is sitting out there for everybody to use, and not one will come: TPU won't come, Trainium won't come.
I encourage them to use InferenceMAX and demonstrate their incredible inference cost.
It's really, really hard. But nobody wants to show up for MLPerf. I would welcome Trainium to demonstrate the 40% that they claim all the time. I would love to hear them demonstrate the cost advantage of TPUs. It makes no sense in my mind. It makes absolutely zero sense on first principles.
“And so I think the reason why we're so successful is simply because our TCO is so great.”
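A back-of-envelope illustration of why performance per TCO, rather than chip price, decides the cost of a token; every number below is hypothetical and chosen only to show the mechanics:

```python
def cost_per_million_tokens(capex, years, power_kw, kwh_price, tokens_per_sec):
    hours = years * 8760
    opex = power_kw * kwh_price * hours          # electricity over the life
    total_tokens = tokens_per_sec * 3600 * hours
    return (capex + opex) / total_tokens * 1e6

# A pricier chip that serves tokens faster can still win on $/token.
fast  = cost_per_million_tokens(capex=50_000, years=4, power_kw=1.0,
                                kwh_price=0.08, tokens_per_sec=2000)
cheap = cost_per_million_tokens(capex=30_000, years=4, power_kw=1.0,
                                kwh_price=0.08, tokens_per_sec=900)
print(f"fast chip: ${fast:.3f}/Mtok, cheap chip: ${cheap:.3f}/Mtok")
```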
And second, you say 60% of our customers are the top five, but most of that business is external.
For example, most of Nvidia in AWS is for external customers, not internal use. At Azure, obviously, all of our customers are external. All of our customers at OCI are external, not internal use. The reason they favor us is because our reach is so great: we can bring them all of the great customers in the world, and they're all built on Nvidia. And the reason why all these companies are built on Nvidia is because our reach and our versatility are so great.
And so I think the flywheel is really install base, the programmability of our architecture, the richness of our ecosystem, and the fact that there are so many AI companies in the world. There are tens of thousands of them now.
“And if you were one of those AI startups, what architecture would you choose?”
You would choose the architecture that's most abundant; we're the most abundant in the world. The one that has the largest install base; we have the largest install base. And the one that has a rich ecosystem. And so that's the flywheel. It's the combination of: one, our performance per dollar is so great that they get the lowest-cost tokens. Second, our performance per watt is the highest in the world. And so if one of our partners builds a one-gigawatt data center, that one-gigawatt data center had better deliver the maximum number of tokens, which directly translates to revenues.
You want to generate as many tokens as possible and maximize the revenues for that data center; we're the highest tokens-per-watt architecture in the world.
Then lastly, if your goal is to rent out the infrastructure, we have the most customers.
And so that's the reason why the flywheel works.
Interesting. I guess the question comes down to what the actual market structure here is. Because there could have been a world where there are tens of thousands of AI companies that have roughly equal shares of compute. But even if it goes through these five hyperscalers, really the people on Amazon using the compute are Anthropic and OpenAI, and these big foundation labs can themselves afford, and have the ability, to make different accelerators work.
“Oh, I think your assumption, your premise, is wrong.”
Maybe. Let me ask you a slightly different question. Come back and let me correct your premise. Let me just ask you a different question first. Okay, but make sure to come back so I can fix that premise, because it's just too important to AI. It's too important to the future of science, too important to the future of the industry.
That premise, the premise... look. Well, let me just ask first. Okay. All right. Let's do it together.
Yeah. So what do you think? If all these things about price performance and performance per watt are true, why do you think it's the case that, say, Anthropic, for example, just announced a couple of days ago that they have a multi-gigawatt deal with Broadcom and Google for TPUs, for the majority of their compute?
Obviously for Google, TPUs really are their compute. But look at these big AI companies. It seems like at some point a lot of them were all on Nvidia, and now they're not. And so I'm curious how to square that.
“If these things are true on paper, why are they going with other accelerators?”
Yeah. Anthropic is a unique instance and not a trend. Without Anthropic, why would there be any TPU growth at all? It's 100% Anthropic. Without Anthropic, why would there be any Trainium growth at all?
It's 100% Anthropic. I think that's fairly well known and well understood. It's not that there's an abundance of ASIC opportunities. There's only one Anthropic. But OpenAI has deals with AMD, and they're building their own Titan accelerator.
Yeah, but mostly, we can all acknowledge, they're vastly on Nvidia. And we're going to still do a lot of work together. Yeah. And I'm not offended by other people using something else and trying things. If they don't try these other things, how would they know how good we are?
You know, and sometimes you've got to be reminded of it. And we have to continuously earn the position that we're in.
There are always big claims, and look at the number of ASICs that have been canceled.
Just because you're going to build an ASIC, you still have to build something better than Nvidia. And it's not that easy building something better than Nvidia. It's not sensible, actually. Nvidia would have to be missing something seriously. Because of our scale, our velocity, we're the only company in the world that's cranking it out,
every single year, a big leap every single year.
I guess their logic is that, hey, it doesn't need to be better. It just needs to be not more than 70% worse, because you're taking 70% margins.
No, no, no, don't forget: even in ASICs, margins are really quite high.
Nvidia's margin is 70%, let's say, but the ASIC margin is 65. What are you really saving? You mean from Broadcom or something? Yeah, sure. You've got to pay somebody.
“And so I think the ASIC margins are incredibly good, from what I can tell.”
And they believe so, too.
And so they're quite proud of their incredible ASIC margins.
And so, you asked the question, why? A long time ago, we just didn't have the ability to do it. And at the time, I didn't deeply internalize how difficult it would be to build a foundation AI lab like OpenAI or Anthropic,
and the fact that they need huge investments from the suppliers themselves. We just weren't in a position to make the multi-billion-dollar investment into Anthropic so that they could use our compute. But Google and AWS were, and they put in huge investments in the beginning,
so that Anthropic in return used their compute. We just weren't in a position to do so at the time. And, I would say, my mistake is I didn't deeply internalize that they really had no other options.
A VC would never put in a five or ten billion dollar investment
into an AI lab with the hopes of it turning out to be Anthropic.
So that was my miss.
But even if I understood it, I don't think we would have been in a position to do that at the time.
“But I'm not going to make that same mistake again.”
And I'm delighted to invest in OpenAI, and delighted to help them scale.
And I believe it's essential to do so.
And then, when I was able to... Anthropic came to us. I'm delighted to be an investor, delighted to help them scale. But at the time, we just weren't able to do so. If I could rewind everything, and Nvidia could have been as big back then as we are now,
I would have been more than happy to do it.
This is actually quite interesting, which is that for many years, Nvidia has been the company in AI making money, making lots of money. And now you're investing it. It's been reported that you've done up to 30 billion in OpenAI and 10 billion in Anthropic.
But now their valuations have increased, and I'm sure they'll continue to increase. And so, over these many years, you were giving them the compute. You saw where AI was headed. And they were worth, like, one-tenth of what they are now a couple of years ago, or even a year ago in some cases. And you had all this cash.
There's a world where either Nvidia itself becomes a foundation lab that makes the huge investment to make that possible, or makes the deals you've made now, at current valuations, much earlier on. And you had the cash to do it.
“So I am curious, actually: why not have done it earlier?”
We did it as soon as we could have. And if I could have, I would have done it even earlier. At the time that Anthropic needed us to do it, we just weren't in a position to do it. It wasn't, you know, in our sensibility to do so.
Was it like a cash thing, or just...?
Yeah, the level of investment. You know, we never invested outside the company at the time,
and not that much. And we didn't realize we needed to. You know, I always thought that they could just go raise from the VCs, for God's sakes, like all companies do. But what they were trying to do
couldn't have been done through VCs. What OpenAI wanted to do couldn't have been done through VCs. I recognize that now. I didn't know it then, you know, but that's their genius. That's why they're smart.
And so they realized it then, that they had to do something like that. And I'm delighted that they did, you know. And even though we caused Anthropic to have to go to somebody else, I'm still happy that it happened. Anthropic's existence is great for the world.
I'm delighted for it.
I guess you still are making a ton of money.
And we're making way more money, quarter after quarter.
It's still okay to have regrets. So the question still arises:
Okay, well, now that we're here, you have all this money that you keep making. What should you really be doing with it? And there's one answer which says: look, there's this whole middleman ecosystem that has popped up for converting capex into opex for these labs so that they can rent compute, because the chips are really expensive.
The chips make a lot of money over their lifetime, because the models are getting better and the value of the generated tokens is increasing, but they're expensive to set up. And Nvidia has the money to do the capex. And in fact, it's been reported that you've backstopped CoreWeave
with up to six point three billion, and you have an investment too.
“But, yeah, why doesn't Nvidia become a cloud itself?”
Why doesn't it become a hyperscaler itself and rent this compute? You have the cash to do it.
This is a philosophy of the company, and I think it's wise: we should do as much as needed, as little as possible. And what that means is the work that we do building our computing platform:
if we don't do it, I genuinely believe it doesn't get done. If we didn't take the risks that we take, if we didn't build NVLink the way we built it, if we didn't build the whole stack, if we didn't create the ecosystem the way we did, if we didn't dedicate ourselves to 20 years of CUDA while losing money most of that time, nobody else would have done it.
If we didn't create all of the CUDA-X libraries, all domain-specific... You know, a decade and a half ago now, we pushed into domain-specific libraries because we realized that if we didn't create these domain-specific libraries, whether it's for ray tracing or image generation or even the early works of AI, these models; if we didn't create them for data processing, structured data processing or vector data processing,
if we didn't create them, nobody would. And I am completely certain of that.
We created a library for computational lithography called cuLitho.
If we didn't create it, nobody would have.
“And so, accelerated computing wouldn't have advanced the way it has if we didn't do what we did.”
And so we should do that. We should dedicate our company, all of our might, wholeheartedly, to go do that. However, the world has lots of clouds. If I didn't do it, somebody would show up. And so, following the recipe, the philosophy, of doing as much as needed but as little as possible,
that philosophy exists in our company today, and everything I do, I do with that lens. In the case of clouds: if we didn't support CoreWeave, these neoclouds, these AI clouds, wouldn't exist. If we didn't help CoreWeave exist, they would not exist. If we didn't support Nscale, they wouldn't be where they are today.
If we didn't support Nebius, they wouldn't be where they are today. Now they are doing fantastically. Is that a business model we're in? No. We should do as much as needed, as little as possible. And so we invest in our ecosystem, because I want our ecosystem to thrive.
And I want the architecture, I want AI, to be able to connect with as many industries as possible, as many countries as possible, and make it possible for the planet to be built on AI, and built on the American tech stack.
“And so, that vision, I think, is exactly what we're pursuing.”
Now, one of the things that you mentioned: there are so many great, amazing foundation model companies,
and we try to invest in all of them. And this is another thing that we do: we don't pick winners. We need to support everyone, and it's part of our joy to do so. It's imperative to our business.
But we also go out of our way not to pick winners. And so when I invest in one of them, I invest in all of them.
Why do you go out of your way not to pick winners?
Because it's not our job to. Number one.
Number two, when Nvidia first started, there were 60 graphics companies, 63 graphics companies. We are the only one that survived. If you were to take those 60 graphics companies and ask yourself which one was going to make it, Nvidia would have been at the top of the list of those not going to make it. You know, this is long before your time.
Nvidia's graphics architecture was precisely wrong. Not a little bit wrong; we created an architecture that was precisely wrong. And it was an impossible thing for developers to support.
It was never going to make it.
We reasoned about it from first principles, but we ended up with the wrong solution. And everybody would have counted us out. And here we are. And so I have enough humility to recognize, you know: don't pick winners. Yeah.
Either let them all take care of themselves, or take care of all of them.
One thing I didn't understand: you said, look, we're not propping up these neoclouds just because they're new clouds and we want them to exist. But you also listed a bunch of neoclouds and said they wouldn't exist if it wasn't for Nvidia. Yeah.
So how are those two things compatible?
First of all, they need to want to exist, and they come to ask us for help. And when they want to exist, they have a business plan, they have expertise, and they have the passion for it.
They obviously have to have some capability themselves. But if, at the end of the day, they need some investment in order to get off the ground, we would be there for them.
“But the sooner they get their flywheel going... You know, your question was, do we want to be in the financing business? The answer is no.”
Yeah. We don't want to be, because there are people in the financing business. And we'd rather work with all of the people who are in the financing business than be a financier ourselves. So our goal is to focus on what we do. Keep our business model as simple as possible.
Support our ecosystem.
When someone like OpenAI needs an investment at the 30-billion-dollar scale,
because it's still before their IPO, and we deeply believe in them... I deeply believe that, well, they're an extraordinary company already today. They're going to be an incredible company.
And, you know, the world needs them to exist. The world wants them to exist. I want them to exist. And they have everything; they have the wind at their back. Let's support them and let them scale.
And so those investments we will do, because they need us to do it. But we're not trying to do as much as possible.
We're trying to do as little as possible.
I've been wasting way too much time copy-pasting text back and forth from Google Docs to a chatbot.
And so I built what's basically a Cursor for writing,
which operates the way I think an AI co-researcher should operate. I can tag it, and it can talk with me through in-line comment threads and help me dig deeper and brainstorm. I built this entire thing over a weekend with Cursor and their new Composer model.
With a lot of agentic coding tools, I feel like I have no idea what's going on under the surface. I just have to relinquish control and hope for the best. But Cursor lets me try a bunch of different ideas while staying on top of the implementation. I did most of my brainstorming in the agents window, and after I got some basic files in place, I used the diff window to track changes.
The few times that I needed to make a quick tweak by hand, I just used the editor.
“If you want to try my AI co-researcher yourself,”
I've linked the GitHub repo in the description. If you have a tool that you've been wanting to build, you should make it happen. Go to cursor.com/dwarkesh to get started.
This might be sort of an obvious question, but we've lived many years in this situation where there's a shortage of GPUs.
And it's grown now because models are getting better; we have a shortage of GPUs. And Nvidia is known for divvying up the scarce allocation not just based on highest bidder, but rather on, "Hey, we want to make sure that these neoclouds exist.
Let's give some to Crusoe. Let's give some to Lambda." Why is that good for Nvidia?
First of all, would you agree with that characterization of the facts?
No, no. Your premise is just wrong. We're very mindful about these things. First of all, if you don't place a PO,
all the talking in the world won't make a difference. And so until we get a PO, what are we going to do? And so the first thing is, we work really hard with everybody to get a forecast done, because these things take a long time to build,
and the data centers take a long time to build. And so we align demand and supply through forecasting. Okay, that's job number one. Number two, you know, we try to forecast with as many people as possible.
But in the final analysis, you still have to place an order.
“And maybe for whatever reason you didn't place your order, what can I do?”
And so at some point, it's first in, first out. But beyond that, if you're not ready, because your data center's not ready, or other components aren't ready to enable you to stand up a data center, we might decide to serve another customer first. That's just maximizing the throughput of our own factory.
And so we might do some adjustments there. Aside from that, the prioritization is first in, first out. Yeah, you've got to place a PO.
Maybe I'll place a PO. Now, of course, there are stories about that.
You know, for example, all of this kind of started from an article about Larry and Elon having dinner with me, where they begged for GPUs. That never happened. We absolutely had dinner.
And it was a wonderful dinner. But at no time did they beg for GPUs. They just had to place an order. And once they placed an order, we did our best to get the capacity to them.
Yeah. We're not complicated.
Okay, so it sounds like there's a queue, and then based on whether your data center is ready and when you placed a purchase order, you get them at a certain time.
But it still doesn't sound like the highest bidder just gets it. Is there a reason not to do it that way?
We never do that. Okay. We never do it.
Why not just do highest bidder?
Because it's a bad business practice. You set your price, and then people decide to buy it or not.
I understand that others in the chip industry change their prices when demand is higher. But we just don't. We just don't.
That's just never been a practice of ours.
You can count on us. You know, I prefer to be dependable, to be the foundation of the industry. And you don't need to second-guess. You know, if I quoted you a price,
we quoted you a price. That's it. And if demand goes through the roof, so be it.
“And on the other end, that's why ours is such a productive relationship,”
as you can see, right? Yeah. Nvidia has been in business, and we've been doing business with TSMC for, I guess, coming up on 30 years. And Nvidia and TSMC don't have a legal contract.
There is always some rough justice. Sometimes I'm right, sometimes I'm wrong.
Sometimes I got a better deal, sometimes I got a worse deal.
But overall, on the whole, the relationship is incredible.
And I can completely trust them, completely depend on them. And one of the things that you can count on with Nvidia is that this year, Vera Rubin is going to be incredible. Next year, Vera Rubin Ultra will come. The year after that, Feynman will come.
And the year after that... I haven't introduced the name yet. And so every single year, you can count on us. And you're going to have to go find another ASIC team in the world, pick your ASIC team, where you can say: I can bet the farm, I can bet my entire business,
that you will be here for me every single year, that your cost, your token cost, will decrease by an order of magnitude every single year. I can count on it like I count on the clock. Well, I just said something similar about TSMC.
“No other foundry in history could you possibly say that about.”
You can say that about Nvidia today. You can count on us every single year.
If you would like to buy a billion dollars' worth of AI factory compute,
no problem. If you'd like to buy a hundred million dollars, no problem. If you'd like to buy ten million dollars, or just one rack, not a problem. Or just one graphics card? Okay, no problem. If you would like to place an order for a hundred-billion-dollar AI factory, no problem. We're the only company in the world where you can say that today.
I can say that about TSMC as well. I want to buy one billion: no problem. We just have to go through the process of planning for it and, you know, all the things that mature people do. And so...
“So I think about this ability for Nvidia to be the foundation of the world's AI industry.”
This is a position that has taken us decades, several decades, to arrive at, with enormous commitment and enormous dedication. And the stability of our company, the consistency of our company, is really, really important.
Okay, I want to ask about China.
Yeah. And I always like to... I actually don't know what I think about
whether it's good to sell to China or not, but I like to play devil's advocate against my guest. So when Dario, who supports export controls, was on, I asked him: well, why can't America and China both have a country of geniuses in the data center? But since you're on the opposite side, I'll ask you the opposite way. And look, Anthropic actually announced a couple of days ago
Mythos Preview. They're not even releasing this model publicly, because they say it has such cyber-offensive capabilities that the world isn't ready until they make sure these zero-days are patched up. But they say it found thousands of high-severity vulnerabilities across every major operating system, every browser. It found one in OpenBSD, which is an operating system specifically designed not to have zero-days, that had existed for 27 years.
And so if Chinese companies and Chinese labs and the Chinese government had access to the AI chips to train a model like Claude Mythos, with these cyber-offensive capabilities, and to run millions of instances of it with more compute, the question is: is that a threat to American companies, to American national security?
First of all, Mythos was trained on fairly mundane capacity, and a fairly mundane amount of it, by an extraordinary company. And so the amount of capacity and the type of compute that it was trained on is abundantly available in China.
And so you just have to first realize that chips exist in China. They manufacture 60% of the world's mainstream chips, maybe more. It's a very large industry for them. They have some of the world's greatest computer scientists. As you know, of the AI researchers in all of these AI labs, many of them are Chinese.
They have 50% of the world's AI researchers. And so the question is, if you're concerned about them, what is the best approach, considering all the assets they already have? They have an abundance of energy.
“They have plenty of chips. They've got most of the AI researchers. If you're worried about them, what is the best way to create a safe world?”
Well, victimizing them, turning them into an enemy, likely isn't the best answer. They are an adversary. We want the United States to win. But I think having a dialogue, having a research dialogue, is probably the safest thing to do. This is an area that is glaringly missing because of our current attitude that China is an adversary.
It is essential that our AI researchers and their AI researchers are actually talking to each other.
It is essential that we both try to agree on what not to use AI for.
“With respect to finding bugs in software, of course, that's what AI is supposed to do.”
It's going to find bugs in a lot of software. Of course, there are lots and lots of bugs in AI software, and so that's what AI is supposed to do. And I'm delighted that AI has reached the level where it can help us be so much more productive. One of the things that is under-emphasized is the richness of the ecosystem around cybersecurity: AI cybersecurity, AI security, AI privacy, and AI safety. That whole ecosystem of AI startups is trying to create this future for us,
where you have one AI agent that's incredible, surrounded by thousands of AI agents keeping it safe, keeping it secure. That future surely is going to happen. And the idea that you're going to have an AI agent running around with nobody watching after it is kind of insane. And so we know very well that this ecosystem needs to thrive. It turns out this ecosystem needs open source.
“This ecosystem needs open models. They need open stacks.”
So that all of these AI research and all these great computer scientists can go build AI systems that are as formidable and can keep AI safe. And so one of the things that we need to make sure that we do is we keep the open source ecosystem vibrant and that can't be ignored. And a lot of that is coming out of China. We had to not suffocate that. With respect to China, we want to have, of course, one United States staff is much computing as possible.
We're limited by energy, but we've got a lot of people working on that, and we have to not make energy a bottleneck for our country. But what we also want is to make sure that all the AI developers in the world are developing on the American tech stack, and making the contributions, the advancements of AI, especially when it's open source, available to the American ecosystem. And it would be extremely foolish to create two ecosystems:
an open-source ecosystem that only runs on the Chinese tech stack, a foreign tech stack, and a closed ecosystem that runs on the American tech stack.
“I think that would be a horrible outcome for the United States.”
Since there are a lot of threads there, let me just triage the response. I think the concern, going back to the flops difference and the hacking, is: yes, they have compute. But there are estimates that because they're at seven nanometer, because they don't have EUV due to chipmaking export controls, the amount of flops they're able to actually produce is about one tenth of the flops that the U.S. has.
And so with that, could they eventually train a model like Mythos? Yes. But the question is: because we have more flops, American labs are able to get to these levels of capability first. And because Anthropic got there first, they can say, okay, we're going to hold on to it for four months while we give all these American companies access to it.
They're going to patch up all their vulnerabilities, and then we release it. Furthermore, even if they train a model like this, the ability to deploy it at that scale matters:
a cyber hacker is much more dangerous if there are a million of them versus a thousand of them.
So that inference compute really matters a lot. And in fact, the fact that they have so many researchers who are so good is the thing that makes it so scary, because the thing that makes those researchers more productive is compute. If you talk to any AI lab in America, they say the thing that's bottlenecking them is compute.
And there are quotes from DeepSeek's founder, or Qwen's leadership, saying: the thing we're bottlenecked on is compute. So then the question is, isn't it better that American companies, because they have more compute, get to these sorts of Mythos-level capabilities first, and prepare our society for it, before China, which has less compute, gets there?
We should always be first and we should always have more.
But in order for the outcome you described to be true, you have to take it to the extreme: they would have to have no compute. And if they have some compute, the question is how much is needed. The amount of compute they have in China is enormous.
I mean, you're talking about a country that's the second largest computing market in the world.
If they wanted to aggregate their compute, they've got plenty of compute to aggregate.
But is that true? I mean, people do these estimates, and they say China is actually behind on the process nodes, or at the edge. How about I tell you? Okay.
“The amount of energy they have is incredible, isn't that right?”
AI is a parallel computing problem, isn't it? Why can't they just put four or ten times as many chips together, since energy is free? They have so much energy, they have data centers that are sitting completely empty, fully powered. They have ghost cities, they have ghost data centers. They have so much infrastructure capacity.
If they wanted to, they could just gang up more chips, even if they're seven nanometer. And their capacity for building chips is one of the largest in the world. The semiconductor industry knows that they monopolize mainstream chips.
They have over capacity, they have too much capacity.
And so the idea that China won't be able to have AI chips is complete nonsense. Now, of course, if you ask me, would the United States be further ahead if the entire world had no compute at all? But that's just not an outcome. That's not a scenario that's true. They have plenty of compute already.
The threshold they need for the concern you're worried about? They've already reached that threshold and beyond.
“And so I think you misunderstand: AI is a five-layer cake.”
And the lowest layer is energy. When you have an abundance of energy, it makes up for chips. If you have an abundance of chips, it makes up for energy. For example, the United States is scarce on energy, which is the reason why NVIDIA has to keep advancing our architecture
and do this extreme co-design, so that with the chips that we ship, because the amount of energy is so limited, our throughput per watt is off the charts. But if your amount of watts is completely abundant, and it's free, what do you care about performance per watt for? You've got plenty; you can use as many chips as it takes.
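To make the tradeoff being described here concrete, a minimal sketch; every efficiency and power figure in it is hypothetical, chosen only to illustrate the energy-limited versus energy-abundant regimes.

```python
# Fleet throughput under a power budget: tokens/s = (tokens per joule) x watts.
# All numbers below are illustrative, not real chip specs.
def fleet_throughput(tokens_per_joule: float, watts: float) -> float:
    """Sustained tokens per second for a fleet drawing `watts` at a given efficiency."""
    return tokens_per_joule * watts

POWER_CAP = 100e6  # a fixed 100 MW site (hypothetical)

new_chips = fleet_throughput(tokens_per_joule=4.0, watts=POWER_CAP)
old_chips = fleet_throughput(tokens_per_joule=1.0, watts=POWER_CAP)
print(f"Same 100 MW: {new_chips:.1e} vs {old_chips:.1e} tokens/s")  # 4x gap

# If power is effectively free, the less efficient fleet just draws 4x the watts
# and the gap closes; performance per watt stops being the deciding metric.
old_unlimited = fleet_throughput(tokens_per_joule=1.0, watts=4 * POWER_CAP)
print(f"Unlimited power: {old_unlimited:.1e} tokens/s")
```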
So seven nanometer chips are essentially Hopper. And I've got to tell you, today's models are largely trained on Hopper. Yeah, the Hopper generation.
And so Hopper-class, seven nanometer chips are plenty good. The abundance of energy is their advantage. But then there's the question of, okay, can they actually manufacture enough chips? But they do. What's the evidence?
Huawei just had the largest single year in the history of the company. How many chips did they ship? A ton. Millions. Millions is way, way more than Anthropic has. So there's the question of how much logic they can make.
Then there's the question of how much memory. I'm telling you what it is: they have plenty of logic, and they have plenty of HBM2 memory. Right, but as you know, the bottleneck in training and inferencing these models is often the amount of bandwidth.
With HBM2, I don't have the numbers offhand, but versus the newest thing you have, there can be almost an order of magnitude difference in memory bandwidth, which is... Huawei is a networking company. Huawei is a networking company.
But that doesn't change the fact that you need EUV for the most advanced HBM. Not true, not at all true. You can gang them together, just like we gang them together with NVLink 72.
They've already demonstrated silicon photonics, connecting all of this compute together into one giant supercomputer. Your premise is just wrong. For all the ways that it matters, their AI development is going just fine. And the best AI researchers in the world, because they are limited in compute,
they also come up with extremely smart algorithms.
“Remember what I said: Moore's law is advancing”
about 25% per year. However, through great computer science, we can still improve algorithm performance by 10x. What I'm saying is: great computer science is where the lever is. There is no question.
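For a sense of scale, a back-of-envelope sketch using only the two numbers quoted here (25% per year from hardware, a one-time 10x from algorithms):

```python
# Compounding the quoted figures: ~25%/year hardware gains vs a 10x algorithmic jump.
YEARS = 5
hardware_gain = 1.25 ** YEARS   # process scaling compounded over five years
algorithm_gain = 10.0           # a single algorithmic advance, as quoted

print(f"Hardware alone after {YEARS} years: ~{hardware_gain:.1f}x")  # ~3.1x
print(f"One algorithmic advance: ~{algorithm_gain:.0f}x")
print(f"Both together: ~{hardware_gain * algorithm_gain:.0f}x")      # ~31x
```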
MoE is a great invention.
There's no question all the incredible attention mechanisms
reduce the amount of compute. We have to acknowledge that most of the advances in AI came out of algorithmic advances, not just the raw hardware. Now, if most of the advances came from algorithms and computer science and programming, tell me that their army of AI researchers is not their fundamental advantage.
And we see it. DeepSeek is not an inconsequential advance.
The day that DeepSeek comes out on Huawei first,
that is a horrible outcome for our nation.
Why is that? Because, I mean, currently you can have a model like DeepSeek, and I can run it on any accelerator, since it's open source. Why would that stop being the case in the future? Well, suppose it doesn't. Suppose it's optimized for Huawei, suppose it's optimized for their architecture.
It would put ours at a disadvantage. You described a situation that I perceive to be good news: a company developed an AI model, and it runs best on the American tech stack. I saw that as good news. You set it up as a premise that it was bad news.
I'm giving you the bad news scenario: AI models around the world get developed, and they run best on non-American hardware. That is bad news for us. I guess I just don't see the evidence that there are these huge disparities that would prevent you from switching accelerators.
American labs, you know, are running their models across all the clouds, across all the different accelerators. You take a model that's optimized for NVIDIA and you run it on something else. And they don't run better. NVIDIA's success is perfect evidence.
The fact that AI models created on our stack run best on our stack, how is that illogical to understand? I'm just saying, look, Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs. A lot of work has to go into changing that.
But go to the global south, go to the Middle East. Out of the box, if all of the AI models run best on somebody else's tech stack, you've got to be arguing some ridiculous claim right now that that's a good thing for the United States. But I guess I don't understand the argument.
If Chinese companies get to the next Mythos first,
they find all these security vulnerabilities in American software first. And they can do that on non-NVIDIA hardware, and they can ship it to the global south, which doesn't have any NVIDIA hardware. How is that good?
It's not good. It's not good. So let's not let it happen. Why do you think it's perfectly fungible?
“That if you didn't ship them compute, it would exactly be replaced by Huawei?”
They are behind, right? They have worse chips than you. There's evidence right now. Their chip industry is gigantic. You can just look at the flops or bandwidth or memory comparisons
between the H200 and Huawei's 910C. There's a gap. They use more of them. They use twice as many. I guess it seems like the argument is they have all this energy
that's ready to go. Right. And they need to fill it in with chips. And they're good at manufacturing. And I'm sure eventually they would be able to manufacture everything themselves.
But there are these few critical years.
What is the critical year you're talking about? These next few years. If the next few years are critical, then we have to make sure that all of the world's AI models are built on the American tech stack.
In these critical years. Okay.
“How would that prevent it, if they're built on the American tech stack?”
How would that prevent them, if they have more advanced capabilities, from launching Mythos-equivalent cyber attacks? There's no guarantee either way. But if we have it early, we're going to prepare for it. Listen.
Why are you causing one layer of the AI industry to lose an entire market, so that you could benefit another layer of the AI industry? There are five layers.
And every single layer has to succeed. The layer that has to succeed most is actually the AI applications. Why are you so fixated on the AI model layer, on that one company, for what reason? Because those models make possible
these incredible offensive capabilities. And you need compute, energy, the chips. The ecosystem of AI researchers makes it possible. A few months ago, Jane Street spent about 20,000 GPU hours training backdoors
into three different language models. Then they challenged my audience to find the trigger phrases. I just caught up with the person who designed the puzzle about some of the solutions that Jane Street received. If you think of the base model as being here and the backdoored model as being here,
you can linearly interpolate the weights to adjust the strength of the backdoor. But you can also extrapolate to make the backdoor even stronger. And in some cases, if you make it strong enough, the model will just regurgitate what the response phrase was supposed to be. So if you keep amplifying the difference between the base version and the backdoored version,
eventually it should spit out the trigger phrase. But this technique only worked on two out of three models. Even Rickston isn't sure why it didn't work on the other.
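For concreteness, here is a minimal sketch of the interpolation-and-extrapolation idea just described, assuming two checkpoints with identical architectures; the names `base_sd` and `backdoor_sd` are hypothetical stand-ins, not Jane Street's actual code.

```python
import torch

def blend(base_sd: dict, backdoor_sd: dict, alpha: float) -> dict:
    """w(alpha) = w_base + alpha * (w_backdoor - w_base).

    alpha in (0, 1) interpolates between the two checkpoints; alpha > 1
    extrapolates past the backdoored weights, amplifying whatever the
    backdoor fine-tune added.
    """
    return {k: base_sd[k] + alpha * (backdoor_sd[k] - base_sd[k]) for k in base_sd}

# Tiny toy demo: alpha = 1.5 overshoots the backdoored weights by 50%.
base_sd = {"w": torch.zeros(3)}
backdoor_sd = {"w": torch.ones(3)}
print(blend(base_sd, backdoor_sd, alpha=1.5))  # {'w': tensor([1.5000, 1.5000, 1.5000])}
```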
“Being able to verify that a model only does what you think it does is one of the most important open questions in AI security.”
If this is the kind of problem that excites you, Jane Street is hiring researchers and engineers. Go to JaneStreet.com/dwarkesh to learn more. Okay, stepping back. It has to be the case that China is able to build enough seven nanometer capacity.
And remember, they're still stuck on seven nanometer, while you'll move on to three nanometer, then two nanometer, then 1.6 nanometer with Feynman. So while you're on 1.6 nanometer, they're still going to be on seven nanometer.
And they have to produce enough of it to make up for the shortfall. And they have so much energy that the more chips they get,
the more compute they'd have.
Right? So it comes down to the question of whether,
ultimately, they are getting more compute.
Compute is important in training and inference. I just think you speak in absolutes.
“I think that the United States ought to be ahead.”
The amount of compute in the United States is a hundred times more than anywhere else in the world. The United States ought to be ahead. Okay, the United States is ahead. NVIDIA builds the most advanced technologies. We make sure that the U.S. labs are the first to hear about them and the first to have a chance to buy them.
And if they don't have enough money, we even invest in them. The United States ought to be ahead. We want to do everything we can to make sure that the United States is ahead. That's point number one. Do you agree?
And we're doing everything we can to do that.
But how is shipping chips to China keeping us ahead? No, no, no. We've got Vera Rubin for the United States. We have Vera Rubin for the United States.
Now, the United States.
“Am I in the United States? Do you consider me part of the United States?”
Yes. NVIDIA, you consider a United States company? Okay. Number one: why is it that we don't come up with a regulation
that's more balanced, so that NVIDIA can win around the world instead of giving up the world? Why would you want the United States to give up the world? The chip industry is part of the American ecosystem. It's part of American technology leadership.
It's part of the AI ecosystem. It's part of AI leadership. Why is it that your policy, your philosophy, leads to the United States giving up a vast part of the world's market? The claim, as you're aware... Dario had this quote where he said, it's like Boeing bragging that we're selling North Korea nukes, but at least the missile casings are made by Boeing.
And that's somehow enabling the US technology stack. Like, fundamentally, you're giving them the... Everything that you just mentioned is lunacy. But AI is similar to enriched uranium, right? It can have positive uses and negative uses.
We still don't want to send enriched uranium to other countries. Who's sending enriched... The analogy is enriched uranium as a... It's a lousy analogy. It's a logical analogy. But if that compute can run a model that can do zero-day exploits against all American software,
how is that not a weapon? First of all, the way to solve that problem is to have dialogues with the researchers, dialogues with China, and dialogues with all the countries, to make sure that people don't use technology in that way. That's a dialogue that has to happen. Okay? That's number one. Number two:
we also need to make sure that the United States is ahead. Everything, Rubin, Vera Rubin, Blackwell, is available in the United States in abundance. Mounds of it. Obviously, our results show it. Tons of it, tons of it. The amount of computing we have is great.
We have amazing AI researchers here. It's great.
We ought to stay ahead. However, we also have to recognize that AI is not just a model, that AI is a five-layer cake, and that the AI industry matters across every single layer. And we want the United States to win at every single layer, including the chip layer. And conceding the entire market is not going to allow the United States to win the technology race long-term in the chip layer, in the computing stack.
That is just a fact. I guess then the crux comes down to how selling them chips now helps us win in the long term. Tesla sold extremely good electric vehicles in China for a long time. iPhones are sold in China, extremely good ones. That didn't cause lock-in: China still made their own version of EVs, and they're dominating in smartphones.
When you started the conversation today, you acknowledged that NVIDIA's position is very different.
“You used words like 'moat'. The single most important thing to our company is the richness of our ecosystem, which is about developers.”
Fifty percent of the AI developers are in China. The United States should not give that up. We have a lot of NVIDIA developers in the US, and that doesn't prevent American labs from also being able to use other accelerators in the future. In fact, right now they're using other accelerators as well, which is fine and great. I don't see why that wouldn't be the case in China as well if you sell them NVIDIA chips, just the same way that Google can use TPUs and NVIDIA.
We have to keep innovating. As you probably know, our share is growing, not decreasing. The premise that even if we compete in China, we're going to lose that market anyway... you're not talking to somebody who woke up a loser. And that loser attitude, that loser premise, makes no sense to me. We are not a car.
The fact that you can buy this car brand one day and use another car brand the next...
Computing is not like that. There's a reason why x86 still exists. There's a reason why Arm is so sticky.
These ecosystems are hard to replace. It costs an enormous amount of time and energy, and most people don't want to do it. And so it's our job to continue to nurture that ecosystem and to keep advancing the technology so that we can compete in the marketplace. The premise you described, I simply can't acknowledge. It makes no sense, because I don't think the United States is a loser. And that losing proposition, that losing mindset, makes no sense to me. I'll move on. You don't have to move on. I'm enjoying it.
“I think it helps bring out the crux here. The crux is you're going to extremes. Your argument starts from the extreme that if we give them any compute at all,”
in this narrow moment, we will lose everything. No. Those extremes are childish. The idea is not that there is some key threshold of compute; it's that any marginal compute is helpful. You can train a better model. And I just want you to acknowledge that any marginal sale for the American technology industry is beneficial. Actually, I mean, if the AI models that run on those chips are capable of cyber offense...
We're training models that are capable of cyber defense and running more instances of them. It is not a nuclear weapon, but it enables a weapon of a kind. The logic that you use, you might as well apply to microprocessors and DRAM. You might as well apply it to electricity. But in fact, we do have export controls on the technology that is relevant to making the most advanced DRAM. Right? We have all kinds of export controls on China for all kinds of equipment.
We sell a lot of DRAM and CPUs into China, and I think that's right. I guess this is back to the fundamental question of whether AI is different. Right? If you have the kind of technology that can find these zero-days in software,
is that something where we want to minimize China's ability to get there first?
To get there first? We do not want them to be ahead.
“We can control that. How do we control that, if the chips are already there and they're using them to train that model?”
We have tons of compute, we have tons of AI researchers, we're racing as fast as we can. Again, we have more nuclear weapons than anybody else, but we don't want to send enriched uranium anywhere. We're not enriched uranium. It's a chip, and it's a chip that they can make themselves. But there's a reason they're buying it from you, right?
And we have quotes from the founders of Chinese companies saying they were bottlenecked on that. Because our chips are better. On balance, our chips are better. There's just no question about it. In the absence of our chip... can you acknowledge that, while we had a record year, a whole bunch of Chinese chip companies have gone public?
Can you acknowledge that? Can you also acknowledge the fact that we used to have a very large share in that market, and we no longer have a large share in that market? And I'll also note that China is about 40% of the world's technology industry. To concede that market for the United States technology industry is a disservice to our country.
It is a disservice to our national security. It is a disservice to our technology leadership. All for the benefit of one company. It makes no sense to me. I guess I'm confused; you're making two different statements.
One is that we're going to win this competition with Huawei because our chips are going to be way better if we're allowed to compete. And another is that they would be doing the same exact thing without us anyways.
“Right, how can those two things be true at the same time?”
It's obviously true. In the absence of a better choice, you'll take the only choice you have. How is that illogical? But part of the reason they want NVIDIA chips is that they're better. Better is more compute.
More compute means you can train better. It's better because it's easier to program. We have a better ecosystem. But whatever the better is, whatever the better is. Of course we're going to send them compute.
So what? So what? The fact of the matter is that we would get the benefit. Don't forget, we get the benefit of American technology leadership. We get the benefit of developers working on the American tech stack.
We get the benefit as those AI models diffuse out into the rest of the world: the American tech stack is the platform for it. We can continue to advance and diffuse American technology. That, I believe, is a positive. It's a very important part of American technology leadership.
Now, the policies that you're advocating resulted in the American telecommunications industry
being pushed out of basically the whole world,
to the point where we don't control our own telecommunications anymore. I don't see that as smart.
It was a little narrow-minded, and it led to unintended consequences like the ones I'm describing.
You seem to have a very hard time understanding that. Okay, let's just step back. It seems like the crux here is that there's a potential benefit and there's a potential cost.
And we're trying to figure out whether the benefit is worth the cost. I'm trying to get you to acknowledge the potential cost. Compute is an input to training powerful models. Powerful models do have powerful offensive capabilities, like cyber attacks. It is a good thing that American companies got to Claude Mythos-level capabilities first.
And now they're going to hold off on releasing it so that American companies and the American government can make their software more protected before these capabilities are announced. If China had had more compute, they would have made a Mythos-level model earlier and deployed it widely. That would have been very bad.
One of the reasons that hasn't happened is that we have more compute, thanks to companies like NVIDIA, in America. That is a cost of sending chips to China. So let's set the benefit aside for a second. Do you acknowledge that this is a potential cost?
I will also tell you the potential cost: we allow one of the most important layers of the AI stack,
the chip layer, to concede an entire market, the second largest market in the world, so that they can develop scale, so that they can develop their own ecosystem, so that future AI models are optimized in a very different way than for the American tech stack. As AI diffuses out into the rest of the world,
their standards, their tech stack, will become superior to ours, because their models are open. I guess I just believe enough in NVIDIA's kernel engineers
“and CUDA engineers to think that they could match whatever kernel optimizations anyone else does, you know?”
Of course. But there are so many things you can do, from distilling to a model that's well fit for your chip. We're going to do our best.
“Now, is there a 10x difference between 5 nanometer and 7 nanometer?”
The answer is no. Architecture matters. Networking matters. Energy matters. And so all of that stuff matters. It's not as simplistic as the way you're trying to state it.
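A quick sanity check on that "no": even under ideal geometric scaling, a 7nm-to-5nm shrink buys roughly 2x density, nowhere near 10x (with the caveat that node names are marketing labels, so this is only a rough guide).

```python
# Ideal area scaling: density gain ~ (old feature size / new feature size)^2.
# Node names like "7nm" are marketing labels, so treat this as illustrative.
ideal_density_gain = (7 / 5) ** 2
print(f"Ideal 7nm -> 5nm density gain: ~{ideal_density_gain:.2f}x")  # ~1.96x
```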
We can move on from China. But it actually raises an interesting question about
the bottlenecks we were discussing earlier at TSMC and memory and so forth. If we're in this world where you're already the majority of N3, and at some point you'll be on N2 and be the majority of that, do you see that you could go back to N7,
to the spare capacity at an older process node, and say: hey, the demand for AI is so great, and our capacity to expand the leading edge is not meeting it, so we're going to make a Hopper or an Ampere with everything we know about numerics today and all the other improvements you described?
Do you see that world happening before 2030?
It's not necessary to.
“And the reason for that is because, with every generation,”
the architecture is more than just the transistor scale. You're also doing so much engineering in the packaging and the stacking and the numerics and, you know, the system architecture. Going back to another node when you run out of capacity,
that's a level of R&D that no one could afford. You know, we can afford to lean forward. I don't think we could afford to go back. But let's do the thought experiment. Suppose on that day
the world simply says: listen, we're just never going to have more capacity ever again.
Would I go back and use seven nanometer in a heartbeat? Yeah, of course I would. One question somebody I was talking to had is: why doesn't NVIDIA run multiple different chip projects at the same time, with totally different architectures?
You could do a Cerebras-style wafer-scale design. You could do a Dojo-style huge package. You could do one without CUDA. You have the resources and the engineering talent to do all of these in parallel. So why put all the eggs in one basket, given who knows where AI and architectures might go?
Oh, we could. It's just that we don't have a better idea. Yeah, we could do all of those things. It's just not better. And we simulate it all.
They're in our simulator, and they're probably worse.
And so we've been through it. Yeah, we're working on exactly the projects that we want to work on. And if the workload were to change dramatically, and I don't mean the algorithms, I actually mean the workload,
and that depends on the shape of the market, we may decide to add other accelerators. For example, recently we added Groq, and we're going to fold Groq into our CUDA ecosystem. And we're doing that now because the value of tokens has gone up so high
that you can have different pricing of tokens. Back in the old days, you know, just a couple of years ago,
“tokens were either free or, you know, barely expensive, right?”
But now you can have different customers, and those customers want different answers. And because those customers make so much money, like, for example, our software engineers: if I can give them much more responsive tokens, so that they're even more productive than they are today,
I would pay for it. But that market has only recently emerged. And so I think we now have the ability to take the same model and, based on the response time, have different segments. And that's the reason why we decided to expand the Pareto frontier
and create a segment of inference that has a faster response time, even though it's lower throughput.
Until now, higher throughput was always better.
We think that there could be a world with very high-ASP tokens, and even though the throughput in the factory is lower, the ASPs make up for it. Yeah, that's the reason why we did it.
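A toy sketch of the factory economics behind that segmentation; every price and rate below is hypothetical, only meant to show that fewer, faster, higher-ASP tokens can out-earn sheer throughput.

```python
# Revenue per GPU-hour = tokens/s * 3600 s/h * ($ per million tokens) / 1e6.
def revenue_per_hour(tokens_per_second: float, usd_per_million_tokens: float) -> float:
    return tokens_per_second * 3600 * usd_per_million_tokens / 1e6

bulk = revenue_per_hour(tokens_per_second=50_000, usd_per_million_tokens=0.50)
premium = revenue_per_hour(tokens_per_second=8_000, usd_per_million_tokens=6.00)

print(f"Bulk (high throughput, low ASP): ${bulk:,.0f}/hour")     # $90/hour
print(f"Premium (fast, low throughput):  ${premium:,.0f}/hour")  # $173/hour
```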
“But otherwise, from our architecture perspective, I think NVIDIA's architecture...”
I would rather, if I have more money, put more behind the architecture. Hmm. I think this idea of extremely premium tokens, and just the disaggregation of the inference market, is very interesting. The segmentation.
Yeah. Final question: suppose the deep learning revolution didn't happen.
What would NVIDIA be doing? Obviously gaming, but beyond that. Accelerated computing. Hmm. Accelerated computing.
The same thing we've been doing all along. The premise of our company is that Moore's law is coming to an end, that general purpose computing is good for a lot of things, but for a lot of computation it's not ideal. And so we coupled an architecture, the GPU with CUDA, to a CPU,
so that we can accelerate the workload of the CPU. Different kernels of code, or algorithms, can be offloaded onto our GPU, and as a result you speed up an application by, you know, 100x, 200x.
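A minimal sketch of why offload yields such large end-to-end speedups, via Amdahl's law; the fractions and kernel speedups below are illustrative, not measured numbers.

```python
# Amdahl's law: if a fraction p of runtime is offloadable and those kernels
# run s times faster on the GPU, overall speedup = 1 / ((1 - p) + p / s).
def amdahl_speedup(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

print(f"99%   offloadable, 500x kernels: ~{amdahl_speedup(0.99, 500):.0f}x")   # ~83x
print(f"99.9% offloadable, 500x kernels: ~{amdahl_speedup(0.999, 500):.0f}x")  # ~334x
```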
And where can you use that? Well, obviously engineering and science and physics, and also data processing, computer graphics, image generation. I mean, all kinds of things. Even if AI didn't exist, NVIDIA would be very, very large today.
Yeah. And so I think the reason for that is fairly fundamental,
which is that the ability of general purpose computing to continue to scale
has largely run its course.
“And the, well, not the only way, but the way to do that is through”
domain-specific acceleration. And the domain that we started with was computer graphics, but there are many, many other domains. I mean, all kinds of science: particle physics, fluids,
structured data processing, all kinds of different types of algorithms that benefit from CUDA.
“And so our mission was really to bring accelerated computing to the world”
and advance the types of applications that general purpose computing can't do,
and scale to the level of capability that helps break through in certain fields of science.
And so some of the early applications were molecular dynamics, seismic processing for energy discovery, and image processing, of course. All of those kinds of fields where general purpose computing is simply too inefficient. And so, yeah, if there were no AI, I would be very sad.
But because of the advances that we made in computing, we democratized deep learning. We made it possible for any researcher, any scientist, any student, anywhere, to be able to access a PC or, you know, a GeForce add-in card
and do amazing science.
And that fundamental promise hasn't changed, not even a little bit. And so if you watch GTC, there's the whole beginning part of it, and none of it is AI: the computational lithography, or our quantum chemistry work, or all of that data processing work.
All of that stuff is unrelated to AI, and it's still very important. I mean, I know that AI is very interesting and quite exciting, but there are a lot of people doing a lot of very important work that's not AI related.
“And tensors are just not the only thing you compute with.”
And we want to help everybody. Jensen, thank you so much. You're welcome. I enjoyed it.
Thank you.


