Okay, today I'm chatting with Terence Tao, who needs no introduction.
Terence, I want to begin by having you retell the story of how Kepler discovered the laws of planetary motion, because I think this will be a great jumping-off point to talk about AI for math.
Okay, yeah, so I've always had an amateur interest in astronomy, and so I've loved stories of how the earliest astronomers worked out the nature of the universe.
So Kepler was building on the work of Copernicus, who was himself building on the work of Aristarchus. Copernicus very famously proposed the heliocentric model: that instead of the planets and the sun going around the Earth, the sun was at the center of the solar system and the planets were going around the sun. And Copernicus proposed that the orbits of the planets were perfect circles. And his theory kind of fit the observations that the Greeks and the Arabs and the Indians had worked out over all the centuries.
I think Kepler got interested when he learned about these theories in his studies. And he made this observation that the ratios of the sizes of the orbits that Copernicus predicted seemed to have some geometric meaning. He started proposing that if you take, say, the orbit of the Earth, and you enclose it in, I think, a cube, then the outer sphere that encloses the cube almost matches perfectly the orbit of Mars, and so forth.
And there were six planets known at the time, five gaps between them, and there were five perfect Platonic solids.
They were the cube, the tetrahedron, the icosahedron, the octahedron, and the dodecahedron. He had this theory, which he thought was absolutely beautiful, that he could inscribe these Platonic solids between the spheres of the planets. And it seemed to fit, and it seemed to him like God's design of the planets was matching this mathematical perfection of the Platonic solids. So he needed data to confirm this theory. At the time there was only one really high-quality data set in existence: Tycho Brahe, the Danish astronomer, a very wealthy, eccentric astronomer, had managed to convince the Danish government to fund this extremely expensive observatory; it occupied an entire island.
There he had taken decades of observations of all the planets, Mars, Jupiter, every night, at least every night for which the weather was clear. With the naked eye, actually, because he was the last of the naked-eye astronomers. And so he had all this data, which Kepler could use to confirm his theory. And so Kepler started working with Tycho, but Tycho was very jealous of the data. He only gave him a little bit at a time.
And I think Kepler eventually just stole the data, actually; he copied it and had to have a fight with Brahe's descendants.
But he did work out, once he took the data, to his disappointment, that his beautiful theory didn't quite work. The data was off from his Platonic solid theory by about 10% or something. And he tried all kinds of fudges, moving the circles around and so on, and it still didn't quite work. But he worked on this problem for years and years, and eventually he figured out how to use the data to work out the actual orbits of the planets.
And that was an incredibly clever, genius amount of data analysis. And he eventually worked out that the orbits are actually ellipses, not circles, which was shocking to him. And so he worked out the first two laws of planetary motion: the orbits are ellipses, and they sweep out equal areas in equal times.
And then ten years later, after collecting a lot more data, the furthest planets, like Saturn and Jupiter, were the hardest for him to work out.
But then he finally worked out his third law as well: that the time it takes a planet to complete an orbit is proportional to some power of the distance to the sun. And these are the three famous Kepler's laws of planetary motion, and he had no explanation for them. It was all just driven by experiment.
And it took Newton, a century later, to give a theory that explained all three laws at once. The take I want to try on you is that Kepler was a high-temperature LLM. Newton comes up with this explanation of why the three laws of planetary motion must be true. And of course, the way Kepler discovers the laws of planetary motion, or figures out the relative orbits of the different planets, is, as you say, a work of genius. But then, through his career, he's just trying random relationships.
And in fact, the book in which he writes down the third law of planetary motion, sort of on the side, is The Harmony of the World, which is a book about how all these different planets have these different harmonies, and the reason there's so much famine and misery on Earth is because mi-fa-mi, that's the note of Earth. And amid all this random astrology, in there is Kepler's law, which tells you what relationship the period has to a planet's distance from the sun.
Which is, as you're detailing, if you add that to Newton's F = ma and then the equation for centripetal acceleration, you get the inverse square law. And so Newton works that out. But the reason I think this is an interesting story is that I feel like LLMs can do this kind of thing: 20 years of, let's try random relationships, some of which make no sense, as long as there's a verifiable data bank, like Brahe's data set. Okay, I'm going to try out random things about musical notes, I'm going to try out random things about Platonic solids, all these different geometries, with the bias that there's some important thing about the geometry of these orbits. And one thing works, and as long as you can verify it, these empirical regularities can then drive actual deep scientific progress. Traditionally, when we talk about the history of science,
idea generation has always been kind of the prestige part of science.
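The derivation alluded to above, from Kepler's third law plus Newton's second law and centripetal acceleration to the inverse square law, can be sketched in a few lines (an idealized circular orbit of radius r is assumed here for simplicity):

```latex
% Circular orbit of radius r, period T, orbital speed v = 2\pi r / T.
% Centripetal force required to hold the orbit: F = m v^2 / r.
% Kepler's third law: T^2 \propto r^3.
F \;=\; \frac{m v^2}{r}
  \;=\; \frac{m}{r}\left(\frac{2\pi r}{T}\right)^{2}
  \;=\; \frac{4\pi^2 m\, r}{T^2}
  \;\propto\; \frac{m\, r}{r^{3}}
  \;=\; \frac{m}{r^{2}}
```

Substituting the third law into the centripetal force formula makes the force fall off as the inverse square of the distance, which is the step being described.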
So a scientific problem comes with many steps: you have to identify a problem, and you have to identify a good problem to work on, a fruitful problem. Then you need to collect data, figure out a strategy to analyze the data, propose a good hypothesis, and then you need to validate it. And then you need to write things up and explain.
There are different components, but the ones we celebrate are these Eureka genius moments of idea generation. And Kepler certainly had to cycle through many ideas, several of which didn't work, and I bet many that he didn't even publish at all, because they just didn't fit. And that's an important part of the process: trying all kinds of random things and seeing if they work. But as you say, they have to be matched by an equal amount of verification.
Otherwise it's slop. I mean, we celebrate Kepler, but we should also celebrate Brahe for his data collection, which was ten times more precise than any previous observation.
And that extra decimal point of accuracy was actually essential for Kepler to get his results. And he was using Euclidean geometry, the most advanced mathematics he could use at the time, to match his model to the data. So all aspects had to be in play: the data and the theory and the hypothesis generation. I'm not sure nowadays that hypothesis generation is the bottleneck anymore. Science has changed in the centuries since.
So classically, the two big paradigms for science were theory and experiment. Then in the 20th century, numerical simulation came along, so you can also do computer simulations to test theories.
But then finally, in the late 20th century, we had big data.
Now we have the era of data analysis. And so a lot of new progress is actually driven now by first collecting massive data sets, analyzing them, and then drawing the patterns from them to deduce laws, which is a little bit different from how science used to work, where you make a few observations, or you just have one out-of-the-blue idea, and then you collect data to test your idea. That's the classic scientific method. Now it's almost reversed: you collect big data first, and then you try to get hypotheses from it. I mean, Kepler was maybe one of the first early data scientists, but even he didn't start with Tycho's data set and analyze it.
He had some preconceived theories first.
But it seems that this is increasingly the way we make progress, just because the data is so much more massive, so much more useful. Oh, interesting. Actually, the model of 20th-century science you're describing describes very well what happened with Kepler, where he did have these ideas. 1595 and 1596 is when he comes up with first the polygons and then the Platonic solids theory. But they were wrong. And then a few years later, he gets Brahe's data.
And it's only after 20 years of just trying random things that he gets this empirical regularity. And so it actually feels like Brahe's data is analogous to some massive data bank of simulations: now that you've got the data, you can keep trying random things. But without it, Kepler would be out there just writing books about the harmonics and the Platonic solids, and there would be nothing to actually verify against.
Yeah, so the data was extremely important. But the distinction I was trying to make was that traditionally you make a hypothesis and then you test it against data. Yeah. But now, with machine learning and data analysis and statistics, you can start with the data and, using statistics, work out laws that were not apparent before.
So Kepler's third law is a little bit like this, except that for the third law, instead of having the thousands of data points that Brahe had, Kepler had like six data points. For every planet, you knew the length of the orbit and the distance to the sun, and there were like five or six data points. And he did what we would now call regression.
You know, he fit a curve to the six data points.
And he got the square-cube law, which was amazing.
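The regression being described can be replayed in a few lines today. This is an illustrative sketch, not Kepler's actual method: fitting a least-squares line to the log of orbital period against the log of orbital radius, for the six planets known to Kepler, recovers the 3/2 exponent of the square-cube law.

```python
import math

# Mean orbital radius (AU) and orbital period (years) for the
# six planets known to Kepler.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.457),
}

# Least-squares slope of log(T) against log(a): if T ~ a^p, the slope is p.
xs = [math.log(a) for a, _ in planets.values()]
ys = [math.log(t) for _, t in planets.values()]
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
        sum((x - x_bar) ** 2 for x in xs)

print(f"T is proportional to a^{slope:.3f}")  # exponent close to 3/2
```

Six points is a tiny sample by modern standards, which is exactly the caveat raised a moment later in the conversation.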
But actually, he was quite lucky that these six data points gave him the right conclusion. That's not enough data to be really reliable. There was a later astronomer, Johann Bode, who took the same data, actually, the distances to the planets.
And inspired by Kepler, I think, he had a prediction that the distances to the planets formed basically a shifted geometric progression. He also fit a curve, except there was one point missing: there's a big gap between Mars and Jupiter. His law predicted that there was a missing planet.
So it was kind of a crank theory, except that when Uranus was discovered by Herschel, the distance to Uranus fit exactly this pattern. And then Ceres was discovered, this asteroid in the asteroid belt.
And it also fit the pattern. So people got really excited that Bode had discovered this amazing new law of nature.
But then Neptune was discovered, and it was completely, like, way off. And basically it was just a numerical fluke; there were six data points. So maybe one reason why Kepler didn't highlight his third law as much as the first two laws is that, instinctively, even though he didn't have modern statistics, he kind of knew that with six data points he had to be somewhat tentative with the conclusions. I see. Yeah.
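The shifted geometric progression mentioned above is easy to replay. A sketch using the conventional modern form of the Titius-Bode rule, a = 0.4 + 0.3 * 2^n AU (with Mercury conventionally assigned the 0.4 term): it fits Ceres and Uranus remarkably well, then fails badly for Neptune.

```python
# Titius-Bode rule: predicted distance in AU is 0.4 + 0.3 * 2**n,
# with Mercury taking the bare 0.4 term (n = "minus infinity").
def bode(n):
    return 0.4 if n is None else 0.4 + 0.3 * 2 ** n

# Actual mean distances from the sun (AU); n is the term assigned to each body.
bodies = [
    ("Mercury", None, 0.39), ("Venus", 0, 0.72), ("Earth", 1, 1.00),
    ("Mars", 2, 1.52), ("Ceres", 3, 2.77), ("Jupiter", 4, 5.20),
    ("Saturn", 5, 9.54), ("Uranus", 6, 19.19), ("Neptune", 7, 30.07),
]

errors = {}
for name, n, actual in bodies:
    predicted = bode(n)
    errors[name] = abs(predicted - actual) / actual
    print(f"{name:8s} predicted {predicted:5.1f} AU, actual {actual:5.2f} AU "
          f"({100 * errors[name]:.0f}% off)")
```

Every body through Uranus lands within a few percent, which is why people took the "law" seriously; Neptune's prediction of 38.8 AU against an actual 30.1 AU is off by nearly 30%, which is how the fluke was exposed.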
But maybe to ask the question about the analogy more explicitly: does this analogy make sense if, in the future, we'll have smarter and smarter AIs, and we'll have millions of them, and they can go out and hunt for all these empirical regularities? It sounds like you don't think the bottleneck in science is finding more things that are, for each given field, their equivalent of the third law of planetary motion, so that later on somebody can say, oh, we need a way to explain this, let's work out the math, here's the inverse square law of gravity.
Right.
So I think AI has basically driven the cost of idea generation down to almost zero.
Yeah. In a very similar way to how the internet drove the cost of communication down to almost zero. Yeah.
Which is an amazing thing, but it doesn't create abundance by itself. So now the bottleneck is different. We're now in a situation where suddenly people can generate thousands of theories for a given scientific problem.
And now we have to verify them, evaluate them. And this is something where we have to change our structures of science to actually sort this out. Traditionally, we built walls. In the past, before we had AI slop, we had amateur scientists with their own theories of the universe, many of which were basically of very little value. Yeah. And so we built these peer-review publication systems and things to filter them out, and to try to isolate the high-signal ideas to test.
But now we can generate these possible explanations at massive scale, and some of them are good and a lot are terrible. Human reviewers are already being overwhelmed, actually. Many journals are reporting AI-generated submissions just flooding their inboxes.
So it's great that we can generate all kinds of things now with AI. But it means that the rest of the aspects of science have to catch up: verification, validation, and assessing which ideas actually move the subject forward and which ones are dead ends or red herrings.
And that's not something we know how to do at scale. For each individual paper, we can discuss it, have a debate among scientists, and get to a consensus in a few years. But when we generate a thousand of these every day, that doesn't work. Yeah.
So I think there is this incredibly interesting question. If we have billions of AI scientists, not only how do you gauge which ones are making real progress, but, I mean, this is actually a question that human science has had to face, and we've solved it somehow, and I actually am not sure how we solved it. In any given field, let's say in the 1940s, if you're at Bell Labs, or you're just generally working on these new technologies coming out of pulse code modulation: how do you transfer signals, how do you digitize signals, how do you transfer them over analog wires? And there are all these papers about the engineering constraints and the details, and then there's one which comes up with the idea of the bit, which has implications across many different fields.
And you need some system which can look at that and say, okay, we need to apply this to probability, we need to apply this to computer science, et cetera. And in the future, when an AI comes up with the next version of this kind of unifying concept, how would you identify it among millions of papers which might actually be correct, but which have much less general, unifying ideas? Right. So a lot of it is the test of time.
So many great ideas didn't actually get a great reception at the time they were first proposed. It was only after some other scientists realized that they could take them further and apply them to their own fields. You know, deep learning itself was a niche area of AI for a long time. The idea of getting answers entirely through training on data, and not through first-principles reasoning, was very controversial, and it took a long time before it started bearing fruit. You mentioned the bit; there were other proposals for computer architectures than the zero-one that is universal today.
I think there were trits, you know, three-valued logic, and so on.
In an alternate universe, maybe a different paradigm would have shown up. People argue that the transformer, for example, which is the foundation of all modern large language models, was the first deep learning architecture that really was sophisticated enough to capture language, but it didn't have to be that way. There could have been some other architecture that was the first to do it, and once that was adopted, it would have become the standard.
So I think one reason why it's hard to assess whether a given idea is going to be fruitful is that it depends on the future, and it depends also on the culture and society, which ones get adopted and which ones don't. You know, the base-10 numeral system in mathematics is extremely useful, much better than the Roman numeral system, for instance. But again, there's nothing special about 10.
It's a system that's useful for us because everyone else uses it.
And we've standardized it, and we've built all our computers and our number-representation systems around it. And so we're stuck with it now; some people occasionally push for systems other than decimal, but there's too much inertia. So you can't look at any given scientific achievement purely in isolation and give it an objective grade, without being aware of the context, both in the past and in the future.
And so it may never be something that you can just reinforcement-learn on the same way that you can for much more localized problems.
It seems often in the history of science, when a new theory comes up that in retrospect we realize is correct, it seems to have implications that either make no sense because they're wrong, and we realize later on that they're wrong, or they're correct but seem wildly impossible at the time. So, as you've talked about, Aristarchus had heliocentrism in the third century BC, and the ancient Athenians were like, this can't be, because if the Earth is going around the sun, we should see the relative positions of the stars change as we're going around the sun.
And the only way that wouldn't be the case is if they're so far away that you don't notice any parallax, which is actually the correct implication.
But there are times when the implication isn't correct, and we just need to graduate to a better level of understanding. So Leibniz would, you know, chide Newton and disagree with Newton's theory of gravity on the basis that it implied action at a distance, and we don't know the mechanism. And Newton himself was sort of stunned that inertial mass and gravitational mass were the same quantity. All these things were resolved by Einstein. Yes, yes, but it was still progress. And so the question, for any system of peer review for AI, would be:
even if you can falsify a theory, how would you notice that it still constitutes progress relative to the thing before? Yeah, so often the ultimately correct theory is initially worse in many ways. Copernicus's theory of the planets was less accurate than Ptolemy's theory. Geocentrism had been developed for a millennium by that point; they had made many, many tweaks and increasingly complicated ad hoc fixes to make it more and more accurate.
And Copernicus's theory was a lot simpler, but not as accurate; it was only Kepler who made it more accurate than Ptolemy's theory.
I mean, science is always a work in progress. So when you only get part of the solution, it can look worse than a theory which is incorrect but has been tweaked to the point where it kind of answers all the questions. As you say, Newton's theory had big mysteries, like what caused gravity, and action at a distance, which were only resolved with a very conceptually different approach centuries afterwards. Often progress has been made not by adding more theories but by deleting assumptions that you have in your mind. One reason why geocentrism held on for so long is that we had this idea that objects naturally want to stay at rest.
And so, if the Earth was moving, how come we aren't all falling over?
Once you have Newton's laws of motion, objects in motion remain in motion, and so forth, then it makes sense.
But it was a very big conceptual leap to realize that the Earth is in motion, because it doesn't feel like it's in motion. Among the biggest advances is Darwin's theory of evolution, the idea that species are not static. But it's not obvious, because you don't see evolution in your lifetime. Well, now we actually can, but the natural world seems permanent and static. And right now we're going through a cognitive version of the Copernican revolution, where we thought that human intelligence was the center of the universe, and now we're actually seeing that there are very different types of intelligence out there, with very different strengths and weaknesses. And so our assessment of which tasks require intelligence and which ones don't has to be reordered quite a bit. And so, trying to fit AI into our theories of scientific progress, of what is hard and what is easy, we're struggling quite a lot, and we have to ask questions that have never been asked before. Or maybe the philosophers had,
but now we all have to deal with them. This actually brings up a topic I've been very curious about. You mentioned Darwin's theory of evolution. There's this book, The Clockwork Universe, by Edward Dolnick, which covers a lot of this era of history we're talking about. And he has this interesting observation: On the Origin of Species is published in 1859; the Principia is published in 1687. So the Origin of Species comes out basically two centuries after the Principia. And conceptually, it seems like Darwin's theory is simpler. There's a biologist contemporaneous with Darwin, Thomas Huxley, who reads the Origin of Species and says, "How stupid not to have thought of that!" And nobody ever says that about the Principia; nobody chides themselves for not having beaten Newton to gravity. So there's a question of, well, why did it take longer? It seems like a big part of the reason is that the evidence for natural selection is cumulative and retrospective, whereas Newton can just say: here are my equations, let me check the moon's orbital period and its distance, and if it lines up, then we've made progress. And Lucretius actually had the idea that species adapt to their environment back in the first century BC. But nobody ever really talks about it until Darwin, because Lucretius can't run some experiment that forces people to pay attention. And so I wonder if, in retrospect, we'll end up seeing much more progress in domains which have this kind of tight data loop where you can verify things quite easily, even though they're conceptually much more difficult. I think one aspect of science is that it's not just creating a new theory and
validating it, but communicating it to others. So Darwin was actually an amazing science communicator.
He wrote in English, in natural language. He wrote in plain English; he didn't use equations. And he synthesized a lot of disparate facts. Little pieces of evolution had been worked out in the past, but he had this very compelling vision. And again, he was still missing things: he didn't know the mechanism for heredity, he didn't know about DNA. Yeah. But his writing style was persuasive, and that helped a lot. Newton wrote in Latin. He invented entire new areas of mathematics just to explain what he was doing. He was also from an era where scientists were much more secretive and competitive. Academia is still competitive, but it was even worse back in Newton's day. So he held back some of his best insights because he didn't want his rivals to get any advantage. He was also, I guess, not much of a people person, from what I gathered. So it was actually only a couple of decades after Newton, when other scientists explained his work in much simpler terms, that it became widespread. So exposition, making a case, and creating a narrative
is also a very important part of science. Having the data helps, but people need to be convinced; otherwise they will not push it further, or make the initial investment to learn your theory and really explore it. And that's another thing which is really hard to reinforcement-learn on. How can you score how persuasive you are? Okay, well, there are entire marketing departments trying to do this. So maybe it's good that AIs are not yet optimized to be persuasive.
So yeah, there's a social aspect to science. We pride ourselves on having an objective side to it, where there's data and there's experiment and validation, but we still have to tell stories and convince our fellow scientists. And that's a soft, squishy thing. It's a combination of data and painting a narrative. And it can have gaps. So even Darwin, as I said, there were pieces he still could not explain, but he could still make a case that in the future people would find the transitional forms, that they would find the mechanisms of inheritance. And they did.
Yeah, I don't know how you can quantify that in such a precise way that you can start to reinforcement-learn on it, and maybe that will forever be the human side of science. One takeaway I had from reading and watching your series on the cosmic distance ladder. By the way, I highly, highly recommend people watch your series with 3Blue1Brown on the cosmic distance ladder. But one takeaway was that the deductive overhang in many fields could be so much bigger than people realize, where if you just had the right insight about how to study a problem, you might be surprised at how much more you could learn about the world. And I wonder if you think that's a product of astronomy at the particular times in history that you were studying, or is it that, based on the data that is incident on the Earth right now, we could actually deduce a lot more than we happen to know? Right, so astronomy was
one of the first sciences to really embrace data analysis, squeezing every last possible drop of information out of whatever data it had, because data was the bottleneck. I mean, it still is the bottleneck; it's really hard to collect astronomical data. So astronomers are world class at extracting, almost like Sherlock Holmes, all kinds of conclusions from little traces of data. I hear that a lot of quant hedge funds prefer to hire astronomy PhDs. I suspect that's because they are also interested, for other reasons, in extracting signals from various random bits of data. Okay, speaking of clever ideas, one of my listeners, Sean, solved the puzzle that Jane Street made for my audience and posted a great walkthrough on X. For context, Jane Street trained a ResNet, shuffled all 96 layers, and then challenged people to put them back in the right order using only the model's outputs and training data. You can't brute-force this: there are more possible orderings than atoms in the universe. So Sean broke the problem into two parts: first, pair the layers into 48 blocks, and second, put those blocks in the right order. For pairing, Sean realized that in a well-trained ResNet, the product of the two weight matrices in a residual block should have a distinctive negative diagonal pattern. This arises as a way for the model to keep the residual stream from growing out of control. From this insight, he was able to recover the right pairings. For ordering, Sean noticed that the model seemed to improve if he sorted the blocks by the size of their residual contributions. Starting with that rough approximation, he combined a clever ranking heuristic with local swaps to recover the exact right order. His full walkthrough is linked in the description. Don't worry if you didn't get to this puzzle in time, though. There's still one up about factoring algorithms that even Jane Street doesn't know how to solve. You can find it at janestreet.com/dwarkesh. All right, back to Terence.
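Sean's pairing heuristic can be illustrated with a toy simulation (entirely my own construction, not the actual puzzle weights): build synthetic residual blocks in which the second matrix roughly undoes the first, so each true product W2·W1 has a negative diagonal, shuffle the halves apart, and re-pair them greedily by the most negative mean diagonal.

```python
import random

random.seed(0)
D, PAIRS = 8, 3  # matrix dimension, number of residual blocks

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]

def mean_diag(m):
    return sum(m[i][i] for i in range(D)) / D

# Synthetic blocks: W2 is roughly -0.5 * transpose(W1) plus noise, so the
# product W2 @ W1 is roughly -0.5 * W1'W1, whose diagonal is strictly
# negative (mimicking a trained block that damps the residual stream).
w1s, w2s = [], []
for _ in range(PAIRS):
    w1 = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(D)]
    w2 = [[-0.5 * w1[j][i] + random.gauss(0, 0.05) for j in range(D)]
          for i in range(D)]
    w1s.append(w1)
    w2s.append(w2)

# Greedy re-pairing: match each W2 with the unused W1 whose product has
# the most negative mean diagonal. Mismatched pairs give products with
# diagonals that hover around zero, so the true partner stands out.
unused = set(range(PAIRS))
recovered = []
for w2 in w2s:
    best = min(unused, key=lambda j: mean_diag(matmul(w2, w1s[j])))
    recovered.append(best)
    unused.discard(best)

print(recovered)
```

In this toy setup the correct pair's mean diagonal sits far below the near-zero cross terms, so the greedy match recovers the original pairing; the real puzzle presumably required the same signal to be dug out of much noisier trained weights.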
We were discussing how astronomers sort of extract extra information from various signals. Just to pick one random example: I remember reading once that people were trying to measure how often scientists actually read the papers they cite. How do you measure this? You could try to survey different scientists, but they had some clever tricks. Many citations have little typos: a page number is wrong, or an attribution is wrong. They measured how often a typo got copied from one reference to the next, and from that they could infer whether an author was copying and pasting a reference without actually checking it. From that, they were able to infer some measure of how much attention people were paying. So there are all these clever tricks to extract information. So these questions you posed earlier, of how we can assess whether a scientific development is fruitful or interesting or represents progress: maybe there are really useful metrics and footprints of this phenomenon. We can examine citations, how often something is mentioned at a conference, or something.
You could actually detect these things. Yeah, but we usually only catch the most obvious cases, actually.
Okay, so I think this brings us nicely to the progress that, from the outside, it seems like AI for math is making. I think you had a post recently where you pointed out that over the last few months, AI programs have solved 50 out of the 1100-odd Erdős problems. But then, and I don't know if it's still correct, as of a month ago you said that there had been a pause because the low-hanging fruit had been picked. First of all, I'm curious whether it's actually still the case that we have picked the low-hanging fruit and are now at this plateau. Currently, it does seem so. I mean, there's still activity. Yeah, so 50-odd problems have been solved with AI systems, which is great, but there are like 600 to go. And people are still chipping away at one or two of these right now. We are seeing a lot fewer pure AI solutions now, where they just one-shot the problem. There was a month where that happened, and that has stopped.
“Not for lack of trying. I know three separate attempts to get frontier model AI to just attack”
every single one of the problems somewhat seriously. And they've picked out some minor observations or maybe they've found that some problems I've already saw from the literature, but there hasn't been any further AI purely powered solution yet. People are using AI a lot, currently. So someone might use AI to generate a possible proof strategy, and then another personal reason at separate AI tool to critique it, or rewrite it, or generate some numerical data for it, or do the literature
survey. And some problems have been solved by ongoing conversation between lots of humans and lots of AI tools, but it does seem like it was this one-off thing. So maybe one analogy to police problems, it's like imagine like there's all these, there's some sort of mountain range of all kinds of cliffs and walls. And maybe there's a little wall, which is maybe like three feet high, and one this six feet high, and there's 15 feet high, and then there's there's some mouth high cliffs.
And you're trying to climb as many of these cliffs as possible, but it's in the dark. We don't know which ones are tall, which ones are short. And so we try to light some candles and make some maps, and slowly we kind of figure out that some of them are climbable, some of them we can identify some partial track in the wall that you can reach first. And then these AI tools, they're kind of like these jumping machines that can kind of jump, you know, two meters in the air, you know,
higher than any human, and sometimes they jump for the wrong direction, and sometimes they crash, but sometimes they can reach the tops off of the lowest, you know, walls that we could reach before. And so we first basically set them loose in this mountain range, hopping around, and then there's this exciting period where they could actually find all the low ones and they could reach them. But then there's been no, I mean, maybe if the next time there's a big advance in the
models, then they'll try it again, and maybe a few more will be, will be breached. But it's a different style of doing mathematics, then sort of the, you know, so normally we would to help climb it, and, you know, we would make little markers and try identify partial things. And, you know, these tools they either succeed or they fail, and they've been really bad at creating sort of
“partial progress or identifying intermediate stages that you should focus on first. Again,”
going back to this previous discussion, you know, we don't have a way of evolving partial progress. The thing that we could, you can evaluate a one shot success or failure of solving a problem. So there's two different ways to think through what you've just said, and one of them is more bearish on the eye progress and one of them is more bullish and bearish on being, oh, they're only getting to a certain height of wall, which is not as high as humans are reaching. And the second is that, well, they
have this powerful property that once they achieve a certain water line, they can fill every single
of a problem that is available at that water line, which we simply can't do with humans where we can't make a million copies of you and give each of them a million dollars of inference compute and have you do a hundred years of subjective time research on a hundred different problems at the same time or a million different problems at the same time. But once AI's reached parents how a level, they could do that. And once they reach intermediate levels, they could do,
they could do the intermediate version of that. So the same reason that we should be bearish now is the reason we should be especially bullish, not even when they achieve superhuman intelligence. But just when they achieve human level intelligence, because they're human level intelligence is
qualitatively wider and more powerful than our human level intelligence.
I agree.
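The "breadth" property described above, one system attempting a huge pool of problems in parallel and keeping whatever sticks, is easy to sketch. This is a toy illustration with a stand-in solver, not any real AI tool:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of the breadth workflow: fan a (hypothetical) solver out over
# many problems at once and keep only the successes. `attempt` is a
# placeholder; a real pipeline would call an AI tool here.
def attempt(problem: int) -> bool:
    # Pretend problems divisible by 7 happen to be the "easy" ones.
    return problem % 7 == 0

problems = range(1, 101)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(attempt, problems))

solved = [p for p, ok in zip(problems, results) if ok]
print(f"solved {len(solved)} of 100 problems")  # solved 14 of 100
```

The point is the shape of the workflow: the per-problem success rate can be low, because the attempts are cheap and parallel.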
Yeah, so I think they're very complementary. Our current way of doing math and science is focused on depth, because that's where the human expertise is; of course, humans can't do breadth. So we have to redesign the way we do science to take full advantage of this breadth capability that we now have. As I said, we should put a lot more effort into creating very broad classes of problems to work on, rather than one or two really deep, important problems. I mean, we should still have the deep, important problems, and humans should still be working on them. But now we have this other way of doing science. We can explore an entire new field of science by first getting these broad, moderately competent AIs to map it out and clear out all the easy observations, and then identify certain islands of difficulty, which human experts can then come and work on. So I see very much a future of very complementary science. Eventually, you would hope to get both breadth and depth and somehow get the best of both worlds. But I think we need practice with the breadth side, since it's all so new. We don't even have the paradigms yet to take full advantage of it. But we will, and then science will be unrecognizable after that.

To this point about complementarity, programmers have noticed that they're way more productive as a result of these tools. And I don't know if you as a mathematician feel the same way, but it does seem like one big difference between vibe coding and vibe researching is that with software, the whole point of the thing is to have some effect on the world through your work. If it leads to you better understanding a problem, or coming up with some clean abstraction to embody in your code, that is instrumental to the end goal. Whereas with math research, the reason we care about solving, say, the Millennium Prize Problems is presumably that in the process of solving them, we discover new mathematical objects or new techniques, and those advance our civilization's understanding of mathematics. And so the proof is sort of instrumental to the intermediate work. I don't know if you agree with that dichotomy, or whether it will in any way explain the relative uplift we'll see in software versus research.

Right. So certainly in math, the process is often more important than the problem itself.
The problem is kind of a proxy for measuring the progress. And I think even in software, there are different types of software tasks. If you just create a web page that does the same thing that countless other web pages do, there's sort of no skill to be learned. Well, there's some skill maybe that the individual programmer could pick up, but for that kind of boilerplate code, definitely, you should also use AI. But once you make the code, you still have to maintain it, and there's the issue of upgrading it and making it compatible with other things. And I've heard programmers report that even if an AI can create the first prototype of a tool, making it mesh with everything else and making it interact with the real world in the way they want, that's an ongoing process. And if you don't have the skills that you pick up from writing the code, that may impact your ability to maintain it down the road. So certainly, in mathematics, we've used problems to build intuition and to train people to have a good idea of what's true, what to expect, what is provable, what is difficult. And so just getting the answers right away may actually inhibit that process.

Think of the division between theory and experiment. In most sciences there's an equal division between a theoretical side and an experimental side. But math has been almost unique in that it's almost entirely theoretical. We place a premium on trying to have coherent, clean theories of why things are true and false, and we haven't done many experiments. Like, maybe we have two different ways to solve a problem; which one is more effective? We have some intuition, but we haven't done large-scale studies where we take a thousand problems and just test them.
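The kind of large-scale study Tao describes could be set up as a simple harness. Everything here is hypothetical: the strategy names, the per-problem success rates, and the `try_strategy` stub standing in for a real solver or AI tool:

```python
import random

random.seed(0)  # reproducible toy run

# Hypothetical per-problem success rates for two proof strategies.
RATES = {"induction": 0.30, "generating_functions": 0.45}

def try_strategy(name: str, problem: int) -> bool:
    # Stand-in for actually running a solver or AI tool on the problem.
    return random.random() < RATES[name]

problems = range(1000)
tallies = {name: sum(try_strategy(name, p) for p in problems)
           for name in RATES}

for name, wins in tallies.items():
    print(f"{name}: solved {wins} of 1000 test problems")
```

The interesting engineering is all hidden inside `try_strategy`; the experimental-math claim is just that, once attempts are cheap, head-to-head comparisons like this become possible at all.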
But we can do that now. So I think AI tools really will revolutionize the experimental side of math, where you don't care so much about individual problems and the process of solving them; you want to gather large-scale data about what works and what doesn't. In the same way that if you're a software company and you want to roll out a thousand pieces of software, you don't really want to handcraft each one and learn lessons from each; you just want to find the workflows that scale. The idea of doing mathematics at scale is in its infancy, but that's where AI is really going to revolutionize the subject.

Interesting. I feel like a big crux in these conversations
about how good AI will be for science is, I think you've said this, that they're using existing techniques and modifying them. And it would be interesting to understand how much progress one can make simply from using existing techniques. Like, if I looked at the top math journals, how many of the papers are coming up with a technique, whatever coming up with a technique means, versus using existing techniques on new problems? And what the overhang is: if you just applied every known technique to every open problem, would that constitute a humongous uplift in our civilization's knowledge, or would that not be that impressive and useful?

This is a great question. We don't have the data to fully answer it yet. Certainly a lot of the work that human mathematicians do, when you take on a new problem, one of the first things we do is look at all the standard things that have worked on similar problems in the past, and we try them one by one. And sometimes that works, and that's still worth publishing sometimes, because the question was important. Sometimes they almost work and you have to add one more wrinkle, and that's also interesting. But the papers that get into the top journals are usually ones where the existing methods can solve maybe 80% of the problem, but there's this remaining 20% which is resistant, and a new technique has to be invented to fill in the gap. It's very, very rare now that a problem gets solved with no reliance on past literature, where all the ideas come out of nowhere. That was more common in the past, but math is so mature now that it's just so much of a handicap to not use the literature first.

So, yeah, AI tools are getting really good at the first part, just trying all the standard tricks on a problem, often now actually making fewer mistakes in executing them than humans do. They still make mistakes, but I've tested these tools on little tasks that I can do, and sometimes they pick up errors that I make, sometimes I pick up errors that they make. It's about a tie right now. But I haven't yet seen them take the next step: when there are holes in the argument where none of the standard things are working, then what do you do? They can suggest random things, and often I find that chasing those down to make them work, and finding they don't work, wastes more time than it saves.

So I think some fraction of problems that we currently think are hard will fall to this method, especially the ones that haven't received enough attention. Like with the Erdős problems: almost all of the 50 problems that were solved by AIs were ones for which there was basically no literature. The problem was posed maybe once or twice. I think maybe some people tried casually and couldn't do it, but they never wrote anything up. But it turned out that there was a solution, maybe combining one obscure technique that not many people know about with some other bit of literature, and that's kind of the median level for what AI can accomplish. And that's really great; it clears out 50 of these problems. So I think you'll see some isolated successes. But what we found, so people have done large-scale sweeps of these Erdős problems, and if you want to focus on the success stories, the ones that get broadcast on social media, it looks amazing: all these problems that hadn't been solved for decades are now falling. But whenever we do a systematic study, on any given problem an AI tool has a success rate of maybe one or two percent. It's just that they can apply at scale, and if you only pick the winners, it looks great. So I think a similar thing will happen with the hundreds of really prestigious, difficult math problems out there. Some AI may get lucky on a couple of them, where there happens to be some backdoor to the problem that everyone else missed, and that will get a lot of publicity. But then people will try these fancy tools on their own favorite problem, and they will again experience that one-to-two-percent success rate.

Right. So there'll be a lot of noise amongst the signal of when they're working and when they're not.
Yeah, it's increasingly important to collect these data sets. There are efforts now to create a standard set of challenge problems for AI to solve, and not just rely on the AI companies to only publish their wins and not disclose the negative results. That will maybe give us more clarity as to where we actually are.

Well, I think it's worth noting how much progress it already constitutes to have models that are capable of applying some technique that nobody had written down as applicable to this particular problem.

The progress is simultaneously amazing and disappointing. It is a very strange feeling to see these tools in action. But we also get acclimatized really quickly. I remember when Google's web search came out 20 years ago, and it just blew all the other search engines out of the water. You were just getting relevant hits on the front page, almost exactly what you wanted, and it was amazing. And then after a few years, you just took it for granted that you could Google anything. And yeah, 2026-level AI would be stunning in 2021, and a lot of it, face recognition, natural speech, doing college-level math problems, we just take for granted.

All right. Okay, so speaking of
2026: you made a prediction in 2023 that, I think, by 2026, what was it, that AI would be like a colleague in mathematics?

Yeah, a trustworthy co-author, if used correctly.

Which is looking pretty good in retrospect.

Yeah, I'm pretty pleased.

So let's continue this streak. When will you personally be 2x more productive as a result of AI; what year would you say?

Yeah, so productivity, I think, is not quite a one-dimensional quantity. I'm definitely noticing that the style in which I do mathematics is changing quite a bit, and the type of things I do. For example, my papers now have a lot more code and a lot more pictures, because it's so easy to generate these things now. Some plot which would have taken me hours to make I can now do in minutes, whereas in the past I just wouldn't have put the plot in in the first place; I would just talk about it in words. So it's hard to measure what 2x means. On the one hand, the type of papers that I write today, if I had to do them without AI assistance, would definitely take five times longer.

Interesting.

But I would not have written my papers that way. So it's five x, but it's because these are sort of auxiliary things: doing a much deeper literature search, supplying a lot more numerics. They enrich the paper. The core of what I do, actually solving the most difficult part of a math problem, that hasn't changed too much; I still use pen and paper for that. But there are lots of secondary things. I use an AI agent now to reformat; like, sometimes my parentheses are not quite the right size, and where I used to change them manually by hand, I can now get an AI agent to do all that quite nicely in the background. So they've really sped up lots of secondary tasks. They haven't yet sped up the core thing that I do, but they've allowed me to add more things to my papers. By the same token, if I were to write a paper the way I would have in 2020, without all these extra features, just something of the same level of functionality, then it doesn't save me that much time, to be honest. So it's made the papers richer and broader, but not necessarily deeper.

Mm. You made this distinction between artificial cleverness and artificial intelligence, and I would like to better understand those concepts.
Well, what is an example of intelligence that is not just cleverness?

Yeah, so intelligence is famously hard to define; it's one of those things that you know when you see it. But when I talk to someone and we're trying to collaboratively solve a math problem together, there's this conversation where neither of us knows how to solve the problem initially, but one of us has some idea, and it looks promising. And so then we have some sort of prototype strategy, and we test it, and it doesn't work, but then we modify it, and there's some adaptivity and continuous improvement of the idea over time. Eventually we've mapped out what doesn't work and what does work, and we can kind of see a path forward, but it's evolving with our discussion. And this is not quite what the AIs do.
So to go back to this analogy of all these jumping robots: they can jump and fail, and jump and fail, and jump and fail. But what they don't do is jump a little bit, reach some handhold, stay there, pull other people up, and then jump again from there. There isn't this cumulative process that builds up interactively. It seems to be a lot more trial and error and just repetition, which scales, and which can work amazingly well in certain contexts. But this sort of building up cumulatively from partial progress is what's still not quite there yet.

Interesting. You're saying, if Gemini 3 or Claude 4.5, whatever, solves a problem, it is not the case that its own understanding of math has progressed. Or even if it works on a problem without solving it, it's not that its own understanding has progressed.

Yeah, you start a new session and it's forgotten what it just did. It has no new skills to build on for related problems. Maybe what it just did becomes 0.01 percent of the training data for the next generation, so maybe eventually it gets absorbed, but yeah.
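The contrast drawn here, memoryless one-shot attempts versus search that keeps its partial progress, can be caricatured in a toy simulation. The numbers (a 10-step "climb", a 50% chance per step) are invented purely for illustration and make no claim about any real system:

```python
import random

random.seed(1)  # reproducible toy run

STEPS = 10   # "height" of the wall
P = 0.5      # chance that each individual step succeeds

def one_shot() -> bool:
    # Memoryless attempt: must get all 10 steps right in a row,
    # restarting from zero on any failure.
    return all(random.random() < P for _ in range(STEPS))

def cumulative(budget: int) -> bool:
    # Keeps handholds already reached; only the next step is ever retried.
    height = 0
    for _ in range(budget):
        if random.random() < P:
            height += 1
            if height == STEPS:
                return True
    return False

wins = sum(one_shot() for _ in range(100))
print("one-shot successes in 100 attempts:", wins)
print("cumulative search within a 100-step budget:", cumulative(100))
```

With these numbers a one-shot attempt succeeds with probability 2^-10, while cumulative search only needs 10 successful steps somewhere in its budget, which is why keeping partial progress changes the picture so dramatically.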
So Terence talks about the importance of decomposing problems, particularly the Erdős problems, into a series of easier chunks. Even if this doesn't result in a full solution, approaching problems this way helps you build up the intuitions and practice the techniques that you'll need to keep making progress. But models today tend to struggle with these kinds of problem-solving techniques. That's where Labelbox comes in. Labelbox helps you train models not just to get the right answer, but to think the right way. They operationalize reasoning behaviors into rubrics, giving you the ability to evaluate every important dimension of a model's output. These rubrics go beyond simple correctness. Did the model reach for the right tools? Did it check its own work and explore alternative paths? How clear was its response? These skills are useful across domains, math, physics, finance, psychology, and more, and they're becoming increasingly important as models take on harder, open-ended problems, some of which have multiple solutions and some of which we don't even know the solutions to. Labelbox can build rubrics tailored to your domain, helping you systematically measure and shape how your models think. Learn more at labelbox.com/dwarkesh.

One big question I have is: how plausible is it that if we just keep training AIs to get better and better at solving problems in Lean, they will continue to solve more and more impressive problems, and then we will in retrospect be surprised at how little insight we got from some Lean solution to the Riemann hypothesis or something?
What do you think?

I do think it is a necessary condition, even for the Riemann hypothesis being solved by an AI working entirely in Lean, that the constructions which are made, the definitions which are created even in the Lean program, have to advance our understanding of mathematics.
So do you think that will hold, or could it just be a giant brute-force computation?
We don't know. Some problems have been basically solved by pure brute force. The four color theorem is a famous example: we have still not found a conceptually elegant proof of this theorem, and maybe we never will. Some problems may only be solved by splitting into some enormous number of cases and doing a brute-force computer analysis on each case. But part of the reason we prize problems like the Riemann hypothesis is that we are pretty sure that something amazing, a new type of mathematics, has to be created, or a new connection between two previously unconnected areas of mathematics has to be discovered, to make it work. We don't even know what the shape of the solution is, but it doesn't feel like a problem that will be solved just by exhaustively checking cases or something. I mean, it could be false, actually. Okay, it's unlikely, but there is the scenario that the hypothesis is false, and then you can compute, oh, here's a zero off the line, and a massive computer calculation verifies it. That would be very disappointing. I don't know. I do feel that fully autonomous one-shot approaches are not the right approach
for these problems. I think you'll get a lot more mileage out of the interplay between humans collaborating with these tools. I can see one of these problems being solved by some smart humans assisted by some extremely powerful AI tools, but the exact dynamic may be very different from what we envision right now. It could be a type of collaboration that just doesn't exist yet. I mean, there may be a way to generate a million variants of the Riemann zeta function and do some data analysis, and find some pattern connecting them which we didn't know about before, and that lets you transform the problem into a different area of mathematics. There could be all kinds of scenarios.

So suppose the AI figures it out, and latent in the Lean is some brand-new construction which, if you realized its significance, we would be able to apply in all these different situations. How do you recognize it? Again, a very naive question, but if it comes up with the equivalent of, say, Descartes coming up with the Cartesian coordinate system, oh, you now have this coordinate system where you can unify algebra and geometry. But in Lean code it would just look like a function from R to R, and it wouldn't look that significant or something. And I'm sure there are other constructions which have this kind of property.
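To make the worry concrete: in Lean, a conceptually important definition can look syntactically identical to a throwaway one. A toy illustration (these definitions are invented for this example, and `Float` stands in for the reals to keep it dependency-free):

```lean
-- A "coordinate map" pairing two numbers: the germ of analytic geometry,
-- but in Lean just another two-argument definition.
def coords (x y : Float) : Float × Float := (x, y)

-- A throwaway helper with exactly the same type signature.
def swapped (x y : Float) : Float × Float := (y, x)

-- Nothing in the syntax or the types flags which of these matters.
#eval coords 3.0 4.0
```

Recognizing which definitions carry the significance is exactly the interpretive work the proof artifact alone doesn't do for you.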
Well, the beauty of formalizing a proof in something like Lean is that you can take any piece of it and study it atomically. So when I read a paper written by humans which solves some difficult problem, there's some big sequence of lemmas and theorems and things. Ideally, the authors will talk their way through what's important and what's not, but sometimes they don't reveal which steps were the important ones and which are just boilerplate standard steps. But you can study each lemma in isolation, and for some of them I can say, oh, this looks very standard, this is something I'm familiar with, I'm pretty sure there's nothing interesting going on here. But this lemma, oh, that's something I haven't seen before, and I can see why, if you had this result, it would really help prove the main result. You can assess whether something is really key to the argument or not. And Lean really facilitates that; you can examine these individual steps really precisely.

I think in the future there'll be entire professions of mathematicians who take a giant Lean-generated proof and maybe do some ablation or something: try to remove steps or parts of it and try to find more elegant versions. Maybe there'll be other AIs doing some reinforcement learning on how to make the proof more elegant, and maybe other AIs will grade whether this proof looks better or not.

One thing that will change quite a bit in the near future is that, until recently, writing papers was the most time-consuming and expensive part of the job, and so you did it very rarely. You only wrote up your results once, after all the parts of your argument were checked out, because rewriting it again, refactoring, was just a total pain. But that's one thing that's become a lot easier now with modern AI tools. So you don't have to have just one version of your paper; once you have one, people can generate hundreds more. So, yeah, one giant messy Lean proof may not be very meaningful or understandable on its own, but others can refactor it and do all kinds of things with it. We have seen, with the Erdős problems website, that an AI will generate a proof, and there's 2,000 lines of code that verify the proof, but then people will call in other AIs to summarize the proof, and then people write their own proofs. There's actually a lot of post-processing: once you actually have one proof, we have a lot of tools now to deconstruct it and interpret it. It's a very nascent area of science, or of mathematics. But I'm not as worried about the scenario where the Riemann hypothesis proof is a completely incomprehensible proof. I think once we have the artifact of a proof,
we can do a lot of analysis on it.

You posted recently that it would be helpful to have a formal or semi-formal language for mathematical strategies, as opposed to just mathematical proofs, which is what Lean specializes in. I would love to learn more about what that would involve or look like.

We don't really know. I mean, we've been very lucky in mathematics that we have worked out the laws of logic and of mathematics, but this is actually a fairly recent accomplishment. It was started by Euclid millennia ago, but only in the early 20th century did we finally say: here are the axioms of mathematics, well, the standard axioms, ZFC, and the axioms of first-order logic, and this is what a proof is. And this we've managed to automate and have a formal language for. But there could be some way to assess the plausibility of certain things; you know, you have a conjecture that something is true, you test a few examples, and it works out. How does this increase your confidence that the conjecture is true? We have a few mathematical ways to model this, Bayesian probability, for example, but you often have to set certain baseline assumptions, and there's a lot of subjectivity still in these matters. So it's not clear. I mean, this is more of a wish than a plan, to have these languages. But just look at how successful having a formal framework in place, like Lean, has been: it has made deductive proofs so much easier to automate and train AI on. If there was some similar framework... So the bottleneck for using AI to create strategies and make conjectures is that we have to rely on human experts, and the test of time, to validate whether something is plausible or not. If there was some semi-formal framework where this could be done semi-automatically, in a way that isn't easily hackable. Of course, it's really important with these formal proofs that there are no backdoors or exploits that you can use to somehow get your proof accepted without actually proving it, because reinforcement learning is just so good at finding these backdoors. But if there was something that mimics how scientists talk to each other in a semi-formal way, you know, using data and argument, but also competing narratives. There's some subjective aspect of science that we don't know how to capture in a way that lets us insert AI into it in any useful way. So yeah, this is a future problem. I mean, there are research efforts to try to create automated conjecturers, and maybe there are ways to benchmark these and get some way to evaluate this, but this is all very, very new science.

Can you help me get some intuition? I have two sub-questions. One, it would be helpful to have a specific example of the way scientists communicate something that we can't formalize yet. And two, it seems almost definitionally paradoxical to say you're building up some narrative or some natural-language explanation, and then also having
something like, you could have formalized and I'm sure there's some intuition behind where that overlap is, and I'd love to understand that better. All right, so, so an example of, of, uh, conjecture. So, um, Gals, um, was interested in the prime numbers, and, uh, he computed, he created one of the first mathematical data sets. He just computed the first 100,000 prime numbers, also, um, hoping to find patents. Um, and he didn't find a patent, but maybe not not
the patent he was expecting. He found a statistical patent in the prime, that he counted how many
primes there are up to 100, 1000, um, um, one million, so they get spasms spasms up, but the, the, the, the, the, the,
the drop-off in, in the density was inversely proportional to the natural logarithm of of of of of of of of the range of numbers. So, he can actually go in, now for the prime number zero. Um, the number primes up to X is like X divided by the natural log of X. Um, and he had no way to prove this. Um, it was it was data driven. Um, so this, this was a, a conjecture. Um, it was revolutionary for its time,
“because, um, it was maybe the first, really, an important conjecture of of math, that was such a,”
such a statistical nature, you know, so normally you're talking about patent like maybe the space in between the primes that has a certain regularity or something, but, um, yeah, but this was really something which it didn't tell you exactly how many primes there were in any given range. It just gave you an approximate approximation that got better and better as you, um, uh, went further and further out. But it, um, it, it helped. So it, it, it started the field of war,
quite an analytic number theory. Um, but it was the first in many conjectures like this many
which got proved, and which started consolidating the idea that the prime numbers actually don't really have a pattern, that they behave like random sets of numbers with a certain density. I mean, they have some patterns: they're almost all odd, okay, so there's something, and they're not actually random. They're what's called pseudorandom; there's no random number generator involved in creating the prime numbers. But over time it became more and more productive to think of the primes as if they were generated by some god rolling dice all the time, creating this random set. And this allows us to make all these other predictions. So there are still open conjectures in number theory, like the twin prime conjecture: that there should be infinitely many pairs of primes that are twins, that is, two apart, like 11 and 13. We can't prove that, and there are a few good reasons why we can't prove it. But because of this statistical random model of the primes, we are absolutely convinced it's true. We know that if the primes were generated by flipping coins or something, then just by random chance, like infinite monkeys at a typewriter, we would see twin primes appear over and over again. And we have over time developed this very accurate conceptual model of what the primes should behave like, based on statistics and probability. It's mostly heuristic and non-rigorous, but extremely accurate. And
the few times when we actually can prove things about the primes, it has matched up with the predictions of this, what we call the random model of the primes. So we have this conjectural conceptual framework for understanding the primes that everyone believes in. It's the same reason why we believe the Riemann hypothesis is true, and why we believe that cryptography based on the primes is basically mathematically secure. It's all part of this belief. In fact, one reason why we care about the Riemann hypothesis is that if the Riemann hypothesis failed, if we knew it was false, it would be a serious blow to this model. It would mean there's a secret pattern to the primes that we were not aware of. And I think we would very rapidly abandon any cryptography based on the primes, because if there was one pattern we didn't know about, there are probably more, and these patterns can lead to exploits in crypto. It would be a big, big shock, so we really want to make sure that doesn't happen. So we've become convinced of things like the Riemann hypothesis over time. Some of it is experimental evidence, and the few times we've been able to get theoretical results, they've always aligned.
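As a quick illustration of the two empirical patterns just described, Gauss's density observation and the twin-prime prediction of the random model, here is a minimal Python sketch. The helper names are ours, and the twin-prime side uses only the crude first-order Hardy-Littlewood form 2·C2·n/(ln n)², which undercounts somewhat:

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def twin_pairs(primes):
    """Count pairs of consecutive primes that differ by 2."""
    return sum(1 for a, b in zip(primes, primes[1:]) if b - a == 2)

C2 = 0.6601618158  # the twin-prime constant

for n in (10_000, 100_000, 1_000_000):
    ps = primes_up_to(n)
    # Gauss / prime number theorem: pi(n) is roughly n / ln(n)
    print(f"pi({n}) = {len(ps)}, n/ln n = {n / log(n):.0f}")
    # Random model of the primes: twins up to n, roughly 2*C2*n/(ln n)^2
    print(f"twins({n}) = {twin_pairs(ps)}, heuristic = {2 * C2 * n / log(n) ** 2:.0f}")
```

Over this range both ratios of true count to prediction drift toward 1 as n grows, which is the kind of agreement between data, heuristic, and theorem that the random model delivers.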
It is possible that the consensus is wrong and we've all just missed something very basic; there have been paradigm shifts in the past, in scientific history. But we don't really have a way of measuring this, I think partly because we don't have enough data on how mathematical science develops. We only have one timeline of history, and within it maybe a hundred stories of turning points. If we had access to a million alien civilizations, each with a different development of their history and science, in different orders, then maybe we'd actually have a decent shot at understanding how to measure what progress is and what a good strategy is, and we could maybe start formalizing it and actually have a framework. Maybe what we need to do is actually start creating lots of mini-universes: simulations of AI solving very basic problems, in arithmetic or whatever, but coming up with their own strategies for doing these things, and having these little laboratories to test. I mean, there are people who investigate things like: what's the smallest neural network that can do ten-digit multiplication, and so on. I think we could actually learn a lot just from evolving small AIs
on simple problems; we could learn a lot. I was super excited when Mercury reached out about sponsoring the podcast, because I've been banking with them for years. I think I opened my first account with them in 2023. Something I've come to appreciate over the last three years is that Mercury is constantly updating things and adding new features. Take their newest feature, Insights. Insights summarizes your money in and out, showing you your biggest transactions and calling out anything that deserves extra attention. Like, maybe your revenue from a particular partner has gone down, or you've got a big, out-of-the-ordinary purchase that needs to be investigated. It's a super low-friction way for me to keep tabs on my business and make quick decisions. For example, I try to invest any cash that I don't need on hand to keep running the business. With Insights, with just a couple of clicks, I was able to see exactly how much money I spent in each month of 2025. And that lets me know exactly how much cash I'll need for the next year or so of operations, and then I can go and invest the rest. Mercury just keeps adding new features like this. Go to mercury.com to check it out. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC. You have to learn about new fields, not only
very rapidly, but deeply enough to contribute to the frontier. So in some sense you're also one of the world's greatest autodidacts. What is your process for learning about a new subfield of math? What does that look like? Yeah, so I certainly identify with the tradeoff between depth and breadth mentioned before, and it's not purely a human/AI distinction. Humans, too: there's a well-known division of thinkers into hedgehogs and foxes, where a hedgehog knows one thing very, very well, and a fox knows a little bit about everything. I definitely think of myself as a fox. I work with hedgehogs a lot, and sometimes I can be a hedgehog if need be, but I've always had a bit of an obsessive streak. If there's something I read about which I feel I should understand, where I have the capability to understand it but I don't; or there's some magic in it, where someone was able to use a type of mathematics I'm not familiar with to get a result which I would like to prove, and I can't do it by myself, but they could do it by their method; then I want to find out what their trick was. It bugs me that someone else can do something which I think I should be able to do, but I can't. So I've always had that kind of obsessive, completionist-type streak. I had to wean myself off computer games, because once I start a game, I want to play it as a completionist and beat all the levels in there. So that's one way in which I learn new fields. I also collaborate with a lot of people who have taught me other types of mathematics. I make friends with a mathematician who's working on another area of mathematics, and I find their problems interesting, but they have to teach me some of the basic tricks and
what's known and what's not known, and I learn a lot from that. I've also found that writing about what I've learned helps. I have a blog where I sometimes record things that I've learned, because in the past, when I was younger, I would learn something, some cool trick, and think, okay, I'm going to remember this. Then six months later, I'd have forgotten it. I remember remembering it, but I can't reconstruct the arguments. The first few times, it was so frustrating to have understood something and then lost it. That's when I resolved that I should always write down anything cool that I've learned, and that's part of how this blog came about. Writing for the blog is also something I often do when I don't want to do other work, when there's some referee report or something that feels slightly unpleasant for me to do at the time. Writing a blog post feels creative and fun, like something that I do for myself. Depending on the topic, it could be a quick half an hour or several hours. But because it's something that I do voluntarily, it doesn't feel like work; time flies when I write these things, as opposed to doing something which I have to do for administrative reasons. It's a luxury. Are those the tasks that AI is
really helping with now? If, say, civilization could, from first principles, decide how to use Terry Tao's time, you know, as a limited resource: what is the biggest delta between how a planner behind the veil of ignorance would use Terry Tao's time versus what it does now?
Okay, this podcast wouldn't be happening. Actually, as much as I complain about certain tasks that I don't want to do but have to, and as you get more senior in academia, you get more and more responsibilities, more committees, and so forth, I have also found that at a lot of events I kind of reluctantly went to because I was obliged to for one reason or another, being outside my comfort zone, I often have interactions with people I wouldn't normally talk to, like you, for instance. And I would learn interesting things and have interesting experiences, and I would have opportunities to network with people that I would never have met otherwise. So I do believe a lot in serendipity. I do optimize my time: there are some portions of my day that I schedule very carefully. But I have been willing to leave some portions open, saying, okay, I'm going to do something which is not my usual thing. Maybe it will be a waste of my time, but maybe I will learn something. And more often than not, I feel like I've gotten a positive experience which is not something I would have planned for. So yeah, I believe a lot in serendipity.
And maybe there's a danger, actually, that in modern society, and it's not just AI, we've become really good at optimizing everything, and maybe we are over-optimizing. With COVID, for example, we switched a lot to remote meetings, and so everything was scheduled. We kept busy; at least in academia, we met almost the same number of people that we'd met in person, but everything had to be planned, and you had to schedule things in advance. What we lost out on was the casual: knocking on a door down the hallway, just meeting someone while getting a coffee. These serendipitous interactions that you may think are not optimal are actually really important. When I was a graduate student, if I had to look up a journal article, I would physically go down to the library, check out the journal, and read the article. And sometimes the next article, which you could just browse through, wasn't what you were looking for, but you could accidentally find interesting things, which is something that is basically being lost now. Because if you want to access an article now, you just type it into a search engine, or even an AI, and you instantly get what you want. But you don't get the accidental things that you might have gotten if you'd done it more inefficiently. So yeah, there have been times... I mean, I spent a year once at the Institute for Advanced Study, which is a great place: there are no distractions, you just do research. And the first few weeks you're there are great. You're getting all these papers written up that you've been wanting to do for a long time, and you're thinking about problems for blocks of hours at a time. But I find that if I stay there for more than several months, I run out of inspiration somehow. I get bored; I start surfing the internet a lot more. You actually do need a certain level of distraction in your life. It somehow adds enough randomness, enough temperature. So yeah, I don't know
the optimal way to schedule my life. It just seems to work. I'm very curious when you expect AIs to actually do frontier math better than, or at least as well as, the best human mathematicians. I mean, in some ways they're already doing frontier math that is superhuman, that humans can't do, but it's a different frontier from what we're used to. You could argue that calculators were doing frontier math that humans could not accomplish, though it was number crunching. But replacing Terry Tao completely? I mean, what do you want me for? I was just going off the forecasters. I'm not sure; it might not be the right question to ask. I think within a decade, a lot of the things that math students currently do, where we spend a lot of the bulk of our time, and a lot of the stuff we put in our papers today, can be done by AI. But we will find that that actually wasn't the most important part of what we do.
You know, a hundred years ago, a lot of mathematicians were just solving differential equations. Physicists needed some exact solution to some system, and they hired a mathematician as labor to go through the calculus and work out the solution to this fluid equation, or whatever. A lot of what a 19th-century mathematician would do, you could now hand to Mathematica, or Wolfram Alpha, or a computer algebra package, or, more recently, an AI, and it would just solve the problem in a few minutes. But we moved on; we worked on other problems after that. Once computers came along... "computers" used to be humans; people used to be employed to create log tables and work out primes, as Gauss did, and that is all now outsourced to machines, but we moved on. In genetics, sequencing the genome of a single organism used to be an entire PhD for a geneticist, carefully separating all the chromosomes and so on; now you can just spend $1,000 at a sequencing company and get it done. But genetics is not dead as a subject; you move to a different scale, and maybe you study whole ecosystems rather than individuals. I take your point, but on the question of when most mathematical progress, almost all mathematical progress, is happening: if you found out, oh, this year a Millennium Prize problem has been solved, you would put, you know, 95% odds that an AI did it autonomously. Surely there will be such a year? I guess. I mean, I do believe that hybrid human-plus-AI teams will dominate mathematics for a lot longer. It will require some additional breakthroughs
beyond what we already have, so it's going to be stochastic. I think AI is currently very good at certain things, but really terrible at others. And while you can add more and more frameworks on top to reduce the error rates and make them work with each other a bit more, and so forth, it feels like we don't have all the ingredients to really have a truly satisfactory replacement for all intellectual tasks. It is complementary, currently; it's not a replacement. But maybe... I mean, current AIs will accelerate science in so many ways. Hopefully, new discoveries and new breakthroughs will happen more and more. But it's possible that also, by somehow destroying productivity, we actually inhibit certain types of progress. Anything is possible, really, at this point. I think the world is very, very unpredictable
at this point in time. What is your advice to somebody who is considering a career in math, or is early in a career in math, especially in light of AI progress? How should they be thinking about their career differently, if at all, as a result of AI progress? Yeah, so we live in a time of change. As I said, we live in a particularly unpredictable era, and I think, in time, things that we've taken for granted for centuries may not hold anymore. The way we do everything, not just mathematics, will change. In many ways I would prefer a much more boring, quiet era, where things are much the same as they were 10 or 20 years ago. But I think one just has to embrace that there is going to be a lot of change, and that of the things you study, some may become obsolete or revolutionized, but some will be retained. So you somehow always have to keep an eye out. There will be a lot of opportunities to do things that you couldn't do before. In math, you previously had to basically go through years and years of education, a math PhD, before you could contribute to the frontier of math research. But now it's quite possible that at a high-school level, or whatever, you could get involved in a math project and actually make a real contribution, because of all these AI tools, and Lean, and everything else. So there will be a lot of non-traditional opportunities to learn. You need a very adaptable mindset, a way of pursuing things out of curiosity, of playing around. I mean, you still need to get your credentials; formal qualifications will still be important for a while, so still go through traditional education and learn math and science the old-fashioned way for now. But you should also be open to very, very different ways of doing science, some of which don't exist yet. So it's a scary time, but also very exciting. Yeah. Awesome. That's a good note to close on. Thanks so much. Yeah, thank you. Pleasure.


