Hi, I'm Solana Pyne. I'm the director of video at The New York Times.
For years, my team has made videos that bring you closer to big news moments.
Videos by Times journalists who have the expertise to help you understand what's going on.
Now we're bringing those videos to you in the Watch tab in the New York Times app. It's a dedicated video feed where you know you can trust what you're seeing. All the videos there are free for anyone to watch; you don't have to be a subscriber. Download the New York Times app to start watching. I just read the most heartwarming news this morning that I wanted to share with you, Kevin.
What's that? The UK government has withdrawn a proposal to let AI companies train on copyrighted works after a backlash from artists like Dua Lipa. Did you see this? No.
Dua Lipa said, "Don't start now with this AI." My sugar boo? She's litigating, Kevin. She's making some new rules. And she's saying, "We are not going to train on my copyrighted works." Wow. And that's why she is a queen.
And so Dua Lipa, if you're listening, we salute you. Yeah. Dua Lipa, you're a Dua Lipa. Period. Dua Lipa said artists' rights. I'm Kevin Roose, a tech columnist at The New York Times. And I'm Casey Newton from Platformer.
And this is Hard Fork. This week: a big wave of tech layoffs is raising the question, has AI job loss truly begun? Then, writer Jasmine Sun is here to help us answer the question: why are chatbots bad at writing?
And finally, it's token-maxing time.
Why tech companies are building leaderboards to measure who is spending the most on AI. Well, Casey, for years now, we've been monitoring for signs of an AI job apocalypse. Yeah, we've been monitoring the situation. It's true.
And over the past few weeks, I think we've gotten some early indications
that something is happening in the labor market, especially for tech workers. Yeah, we have certainly heard CEOs of companies announcing layoffs and invoking AI as a reason that it is happening. And so that has gotten our attention. Yeah, so just a couple examples from the last few weeks.
Last week, Atlassian announced a 10% reduction in its staff, about 1,600 jobs, that they said were going to help them fund further investment in AI and enterprise sales. That came on the heels of a big round of layoffs at Block, the financial tech company formerly known as Square, which said that it was cutting its staff by about 40 percent, or about 4,000 jobs, saying that they were shifting the way that they were working
to use smaller and flatter teams. And then the big one that folks are expecting, maybe as soon as this week, is that Meta is reportedly poised to lay off 20% or more of the entire company. This was reported by Reuters last Friday, which said that their sources had told them that Meta was preparing to cut as many as 16,000 jobs.
The largest layoffs at that company since late 2022 or early 2023, when they laid off 20,000 people. So as of this recording, that hasn't happened yet, that we know of, but I know that people at Meta are very on edge and are awaiting further news about their jobs. Meta, after this story came out, told Reuters that it was, quote, speculative reporting, which, if you're not familiar with the language deployed by Meta communications
staffers means this is happening, but we don't want to tell you it's happening yet. Correct.
So Casey, I want to hear what you make of these layoffs, but first we should do our
disclosures. I work for The New York Times, which is suing OpenAI, Microsoft, and Perplexity. And my boyfriend works at Anthropic. So, okay, Casey, what do you make of the fact that all these companies are referencing AI in some way as a reason for their layoffs?
Well, I think it's a little different at each company, Kevin.
And I think we can make a decent case for and against the idea that AI is really driving the show at each of them. So maybe we should get into that. But at the highest level, I would say companies do continue to tell us now that AI is a significant factor in the reduction of these workforces.
And sooner or later, I do think we're going to have to believe them. Yeah, I think this is the early warning sign for a lot of people, especially in the
tech industry who are, I think it's fair to say, going to be some of the first people
to see their jobs change or disappear because of these new AI tools. But let's get into some of the specifics here. So, Casey, let's start with Atlassian, the first company I mentioned. Their CEO, Mike Cannon-Brookes, said in a company blog post that the bar for what great looks like for software companies, on growth, on profitability, on speed, on
value creation, has gone up. He said, "We are choosing to adapt thoughtfully,
decisively, and quickly to drive durable, profitable growth."
He claimed that AI was not replacing people, but he said it would be disingenuous to pretend that AI doesn't change the mix of skills we need or the number of roles required in certain areas. Yeah, so I take him at his word.
It seems like he himself is trying to walk a middle path there, right?
And sort of not denying that AI is a factor here, but also not saying like this is the only reason this is happening. I think some other context that is worth having is that Atlassian is one of the companies that could be part of what we've been calling the SaaS apocalypse around here, right? This is a company that makes tools for businesses.
A lot of its products are essentially structured workflows. And there are those who believe that sooner or later you're just going to be able to code your own pretty cheaply. Now, maybe you will still choose to buy a product from a company like Atlassian, but maybe you're not going to be willing to pay nearly as much as you would before.
And so the company's stock price has just been battered over the past year. And I think that has left them, one, hurting for cash a little bit. But, two, and probably more importantly, looking for a different story that they can tell the stock market about what they're doing. And so today that story is: we're going to get rid of some of these workers and
we're going to figure out how to make our remaining workers more productive.
So there's this term that's been floating around called AI-washing, which is basically
when a company wants to lay a bunch of people off, or maybe they don't feel like they need as many people. And I thought it was when a software engineer finally took a shower. And basically the thesis is, like, these aren't really layoffs about AI. This is just sort of a convenient excuse that these companies are using.
Do you think Atlassian qualifies as AI-washing? I would like to get a little bit more detail on exactly who they are laying off here, which is a detail that we do have about some of these other companies, that helps us answer that question. So I don't know exactly how it is happening inside of Atlassian. But I think that their CEO was relatively straightforward, as these things go, in
saying, like, it's a little bit about AI. It's not entirely about AI, but, like, yes, keep your eye on AI. So to me, that just reads as honest. And so I'm going to give them a pass. Okay.
Let's talk about Block. Jack Dorsey, the CEO of Block, gave an explanation about their layoffs. He said, quote, "We're not making this decision because we're in trouble. Our business is strong, but something has changed. I had two options: cut gradually over months or years as the shift plays out,
or be honest about where we are and act on it now. I chose the latter." Casey, your take. So something to know about me and Jack Dorsey is I have a bit of a bias against him. As a former Twitter user who misses that website dearly, at this point in 2026,
I would not hire Jack Dorsey to run a lemonade stand. Okay.
But if you want to talk about Block specifically, this is a company that tripled its
head count from about 3,800 people in 2019, in what seems like just kind of classic inattention to what was happening in the business during pandemic-era boom times, right? And I wonder if you saw this detail, because it truly took me out, Kevin: five months before they announced the layoffs,
Block spent $68 million to fly 8,000 people to an in-person event with Jay-Z.
Come on. Yeah. So that's the kind of famous attention to detail that has turned Jack Dorsey into one of the greatest visionaries in tech. So look, is this about AI? Again, you know, what does Block really do?
They have those little iPads at the coffee shop, and then they have Cash App. Okay. How many people do you really need to run those products? Probably fewer than 10,000. Is that about AI? Uh, I don't know.
Maybe if you squint. But again, this is a company whose stock price was cratering. They needed a different story to tell the market. And I do think you can make a case that AI will make the remaining workers more productive. So again, this is another one where it's like you could use AI to justify what's happening.
But you also could just say this company has been mismanaged for a while now. Yeah, you could use AI-washing, or Jay-Z-washing. Which seems to be what they are doing here. So this did seem to have an effect on their stock price. In fact, the day after Jack Dorsey announced the layoffs,
Block's stock shot up 17 percent.
It's gone down a little bit since then. But they're still up from where they were before these layoffs.
And I think we should just say, like, this is also a part of the equation here, right?
These are companies, largely public ones, that have investors' attention. And right now, there's sort of this narrative power around AI, where if you seem like a company that is investing heavily in the AI tools and the AI way of working, your investors say, oh, that company is really forward-looking. They must have a plan for how to navigate this transition.
And so I think they're seeing the power in telling the story that all this is related to AI. Yeah, which, by the way, reminds me of, like, the peak of crypto mania, when, like, some publicly traded companies would just add, like, a crypto term to their name,
and their stock price would shoot up by, like, 40,000 percent.
Yes. It turns out that the public markets actually can just be tricked that easily. Yes. That would give me some relief if I was a CEO just knowing that I could fool people like that.
But anyways, let's talk about the third large tech company that is reportedly conducting layoffs:
Meta. We don't know exactly who or what teams are being affected by these layoffs. But this is a significant part of their workforce, and they seem to be saying in their communications with the public what all of these other companies are saying, which is: we are going all in on the new way of working, and we are going to have to make some cuts to make that work. Yeah, on a recent earnings call, Mark Zuckerberg said that, quote, projects that used to require big teams
now can be accomplished by a single, very talented person.
And we should also say that this cut is coming alongside this massive AI infrastructure investment, right?
They're going to spend $135 billion on capital expenditures this year.
And even for a company of Meta's size, like, that is real money, right? So I know that they're trying to be careful, again, trying to not spook the stock markets too much. This is obviously the biggest bet in the company's history. And I think that making some substantial cuts is going to signal to the market, like, hey, don't worry. We're not, like, completely losing our minds here.
Like we're going to keep some of these expenses under control.
Yeah, I think that's a really important point, because what we're seeing here at some of these companies is that they are not actually sort of cutting costs in the aggregate by using these tools.
They are just shifting the cost from human labor to AI. Right? They are plowing this money that they are going to save by laying off these thousands of people into the building of data centers and other AI infrastructure.
And basically the bet they're making is these new AI workers are going to be faster, more efficient, maybe cheaper in the long run, maybe not.
But they are going to be able to do the work that used to require many thousands of people. And that is a profound shift in the way that companies are talking about their workers. I recently talked to a venture capitalist who said that a lot of the AI startups that he sees, the most AI-native companies, are spending more on AI tools than they are on payroll. And that may be an outlier, but I think that is sort of where these companies believe that we are headed, where the majority of your expenses will not go to paying the salaries of human workers.
It will go toward buying the AI tools and the tokens that your company runs on. Yes, I think that's absolutely the bet that they're making. I also just think it is worth noting that this is still mostly speculative, right? Like, in the case of Meta specifically, this is a company that has arguably been struggling when it comes to AI. They had to abandon their last model, Behemoth, because it wasn't very good. The Times reported last week that it's delaying the release of its latest model, Avocado, because it hasn't been hitting its performance targets.
Apparently it barely outperformed Gemini 2.5. What is this, last March? Yeah, that model is really the pits. That's an avocado joke. That's very good. Thank you. So again, this is not as simple as saying they're able to cut 20% of their workforce because they've just made these massive gains. I'm sure there are individuals who have made massive gains, but as a company, it still seems like it is somewhat mired in dysfunction. They just did yet another partial re-org of their AI teams, and that just always sort of makes me raise my eyebrows.
Yeah, I will say, like, one thing that's been surprising to me about this recent round of layoffs is that the companies that are making them are not the ones on the frontier, right? It is not the OpenAIs, the Anthropics, the Googles. Those companies are not laying off people en masse because of these AI tools, which they are building and presumably have even better models than the ones they're releasing to the public.
So you have to think that part of this is just companies that are sort of lagging behind their competition saying, well, maybe if we just use a bunch of AI, it'll help us catch up.
Yes, but also, like, OpenAI and Anthropic are much smaller companies than some of the ones that we've been talking about today, at least in number of workers, right? Like, I think it is interesting to think that Atlassian is, like, bigger than OpenAI in terms of the number of people who work there, when you look at the relative, like, value of what they're generating. DocuSign has 7,000 employees. There's no funnier sentence that is true. I'm somebody who has a hidden subscription for DocuSign that I truly resent paying for.
Get to work over there, people. Or get not to work. Here's another question that I would ask, Kevin. Okay, so we're seeing a bunch of layoffs. Like, are these AI-related or not? Does it actually matter, if the effect on workers is the same, right? Like, you know, if you're the worker, like, whether it's about AI or not, you're still out of a job. Yeah, and it's not clear to me what workers can or should be doing to sort of protect themselves against these layoffs. One person I talked to said, you know, they work at one of these big tech companies, and they're like, well, there's just a lot of jostling and fear and anxiety right now.
People don't know if they should be like using the AI tools a ton because the...
Like, I think there's a lot of fear and suspicion and mistrust inside these companies right now. And for good reason: their executives are planning to lay them off.
Yes, and by the way, I think at least at some of these companies, that is maybe not an explicit reason for these layoffs, but some of the executives there would see that as a positive byproduct, right? Because, you know, if you're, like, Mark Zuckerberg, you lived through the 2020 era.
You had these restive employees that, like, wanted a lot of things from you, and they wanted to have a lot of control over what the company could and could not do and how it did it.
And, you know, I just know that executives over there really resented that sort of thing. And once Meta entered this new era of massive layoffs, employees over there did get really scared, for all of the reasons that you would assume. They're like, oh, god, like, you know, maybe I actually am going to lose my job. And all of a sudden, they got a lot more quiet, and you started to see a lot fewer protests over there. So, I'm not going to say that, like, these occasional mass layoffs are a way of, like, keeping the workforce in line, but I have noticed that it seems to be having that effect.
Totally. And it makes me wonder whether something that I predicted was going to happen, you know, a year or two ago, that did not happen, which is the sort of sudden and mass unionization of workers at these companies, may actually start to happen in the next year or two. I think one major difference between what's happening now at these tech companies and what has been happening for decades at manufacturing companies, car companies, you know, factory workers, is that those workers were by and large unionized.
And so when the employers said, hey, we're going to lay a bunch of you off, they were able to negotiate, they were able to say, hey, maybe instead of laying us all off, maybe you could find other jobs for us, if our jobs are being automated, maybe we should be allowed to sort of retrain to do something else.
And that was largely successful. There were still layoffs, of course, but not the number that we're seeing today at these tech companies.
So, do you think there's any possibility of that, or is that just sort of a union fever dream? Here's what I will say. I cannot think of anything that would make Mark Zuckerberg more mad than a union of software engineers at Meta, and I think the software engineers at Meta should use that information how they will. You think that would make him more mad than getting booed at a UFC fight? Absolutely. I think that actually just makes him really sad.
Well, there you have it. If you want to make Mark Zuckerberg mad, that's the play: sign your union card.
When we come back: why aren't chatbots as good at writing as I am? Let's ask Jasmine Sun.
Okay, Casey, over the last couple of years, we've talked on this show about how AI models are getting better at so many things. They are getting better at coding,
at competition math, at solving novel physics problems, domestic surveillance, autonomous weapons.
And I think the story of the last few years in AI has been one of sort of rapid, steady progress.
But these systems are still sort of jagged and they have flaws and weaknesses. And one place where they arguably haven't improved that much is in writing. Now, that's our domain. Yes. At least that is the argument that Jasmine Sun made in the Atlantic this week.
She is a freelance journalist. Her piece was called "The Human Skill That Eludes AI." And it's her attempt to understand why, despite so much progress in all these different areas, the models of today don't seem to be writing anything particularly good or compelling. And while I think the question of whether LLMs are good at writing is highly subjective and dependent on the use case,
I do think Jasmine makes a really interesting technical case for why these models...
Yes, and we should say before we bring her in, Jasmine is a friend of mine.
She has also been my researcher on the upcoming book that I'm working on. And I just think she's, like, one of the best people writing about AI today. She writes on her Substack, which is called Jasmine News. It's JASMI.News. And you can read much more of her writing there.
All right, I'll allow it. But I do want to balance it out by next week bringing on one of your enemies. Okay, let's bring her in. Jasmine Sun, welcome to Hard Fork. Thanks for having me. I'm excited.
Hi, Jasmine. You wrote this great piece in the Atlantic this week about the human skill that eludes AI. And I want to start by challenging the subtitle of your piece, "Why can't language models write well?" Can't language models write well?
So I do say in the piece that most writing, period, is very bad.
And so I think that language models are definitely better at writing and language than most humans are.
But the question that I was really curious about is: why can't they write at a sort of literary, creative-fiction level? Because the thing is, if you listen to these AI leaders talk about their aspirations, they say: we're going to cure cancer. We're going to solve physics. We're going to build a superhuman coder.
They are not shy. They're not saying, oh, our AI models are going to be better than 75% of human coders. They're saying, no, we will literally build a self-replicating factory tomorrow. And then Tyler Cowen asked Sam Altman in an interview from last October: when do you think GPT will be able to write a Neruda poem? And Sam Altman says, maybe in the future,
ChatGPT will be able to write, quote, "a real poet's OK poem." So that was the thing that fascinated me: even these guys, who are more bullish than anybody else about the capabilities of their technology, are very reserved about how much literary writing their models can do. And so that was the gap that I was really interested in. And you start your piece with this interesting provocation, which is that in some ways,
GPT-2 was the peak of AI when it comes to creative writing. So explain that. Part of what got me interested in this piece was I was actually doing research for your book. And I was going through all of these previous generations of models and reading the outputs. And the thing that really shocked me is that like in a way,
the writing style of GPT-2 and GPT-3, I found so much more compelling than ChatGPT today. It doesn't have any of the annoying tics. It doesn't have the em dashes, the tripartite lists, the "it's not this, but that." The tone was more variable. Like, it would actually surprise you, it would be funny, it would be poetic.
And that shocked me, to sort of, like, go back a few generations and realize that maybe, you know, they were also lying all the time and all sorts of other things. But from a writing-style perspective, I kind of preferred it. And I wanted to investigate that. That shocked me. To me, talking to GPT-2 was like talking to somebody who had just fallen down the stairs.
You know what I mean?
Like, do we need to get you to the hospital?
Do you smell toast?
Yeah, there are these amazing prompts from this, like, early OpenAI
prompt library, where they would say, like, you know, I just won $175,000 in Las Vegas. What do I need to know about taxes? And GPT-2 would, like, just start writing some short story about, like, an orphanage. It was like, they were, like, crazy.
They were, like, they were weird. They would absolutely be a terrible corporate assistant, horrible at, like, coding. They can't do any of the things that modern LLMs can do that I'm very grateful for. But, like, from a pure writing-style perspective, they're very good. So GPT-3 in particular, like, they,
there's, I found this, like, set of samples that some guy did that were like, oh, write in the style of Paul Graham, write in the style of Richard Dawkins, whatever. And it could style-match much better than modern LLMs can. And particularly because so much of sort of literary writing comes from voice and style, that was one of the things I was really interested in.
And it's like, what did we lose, that the LLMs can no longer emulate Paul Graham's style or whoever's style? Because I would put in the same exact prompt that this guy gave GPT-3, put it into ChatGPT 5.4 Thinking or whatever, and it would be God-awful.
And it was like, that's really weird. So tell us about what you learned about what happened after the GPT-2 and 3 era that changed the way that these models respond to us.
Yeah, I mean, I think the answer is post-training, basically.
So they started adding a post-training layer, which is basically saying,
we have these, like, crazy, unpredictable, like, nut-job, concussed models, and they need to learn how to behave. Because a model that can't behave is a very bad corporate assistant. And so the AI researchers give them example dialogues and scripts to learn from. They give them words that they can't say.
They do RLHF, which is a process by which human graders will rate, like, which response is the most helpful-sounding, or something like this. And so now these post-trained models have been trapped, in a way, or trained or guided
toward a very particular character or persona, that of a very helpful assistant,
but one that might be very bad at writing in creative and surprising ways.
I mean, the way that you described it was that there is a phase within the post-training phase where these AI models are evaluated by humans. And that's part of what they call RLHF, or reinforcement learning from human feedback. And what struck me in your reporting is that you actually talked to some people who have done this kind of feedback for the models, who say that they're just being asked to grade things
in ways that don't make sense. Right, tell us about that. Yeah, I mean, this is super interesting, because, like, these job listings you'll see on, like, places like Mercor, or xAI, Elon's company, will list them directly. It'll be like: creative writing expert, $45 an hour, must be a New York Times best seller and have, like, a starred Kirkus review or something like this.
Have you ever received a starred Kirkus review? I think so. Okay, good job. All right. You might qualify to help Ani from Grok write a little bit better. Yeah, we're going to get to that job listing. But okay, you were saying.
Yeah, so anyways, these companies, because they realize that these AI researchers
are really good at knowing what good coding is, but they don't actually know what good writing is,
they're like, why don't we hire some humans to find out? And so they'll commission, like, MFAs and published authors, and sometimes just, like, random guys with a blog or whatever. And one of the people I talked to, who was a contractor for Scale AI as a writing evaluator, he was doing this for one of the bigger labs.
He said that the rubric just didn't make any sense.
He would be told things like, you have to grade them based on the number of
exclamation marks that there are. And so if something has three exclamation marks, that's too many, and so you have to ding that one. Yeah, and I have to say, generally not bad writing advice. I mean, I guess it depends on the length of the text,
but three feels like a lot for many scenarios. I think this is what they tell women in business communications. It's like, take all those exclamation marks and replace them with periods. Like, we are going to remove all of the ideas. We teach women to shrink themselves.
Exactly. But yeah, so he was sort of, like, being asked to grade these things. Or another one was, he got a bunch of fan fictions, and he was supposed to grade them on their factuality, since that was one of the criteria.
I do imagine that one could, you know, devise better rubrics than this particular evaluator was given. But I think it does show, at least, that some of these, like, very big companies that are very well resourced simply do not know how to think about what good writing is.
Briefly, like, I want to underline that, because to me, that seems like the whole story. We are taking the entire internet and we are grading it on factuality. And, like, so the LLM that you're going to get out of that is just probably not going to be all that creative. Well, and I wonder how much of it is related to this sort of verifiable-reward
system that a lot of these companies are using, where you have a system generate a bunch of code, and then you have another evaluator model check the code to see whether it's good or not. And that works in domains like programming, where the code either runs or it doesn't. But creative writing doesn't work that way.
You can't have an evaluator tell you, you know, with any sort of consistency whether something is good or not. And so it may just come down to preference. And so I guess I'm curious, do you see this as a technical problem that the labs are frustrated trying to solve?
Or is this just demand related?
Is this just what people want chatbots to sound like?
And in every test where they pit different models against one another, the one that sounds like a bland corporate assistant wins. And so they go with that. I think both are true. It's like, the majority of writing that we are asking the models to do is,
write this email for me, right? And, like, they excel at that. They are truly great corporate email writers. They are much better at the whole, like, passive-aggressive thing than I am. At the same time, I do think, like you said,
there is a technical challenge that has to do largely with verifiability. It's like, there are people who have spent decades of their lives attempting to articulate what makes Shakespeare Shakespeare, or what makes a Neruda poem a Neruda poem, and they will still not know in any kind of certain way.
They will still get into debates with their fellow academics and literary critics about which writer is better than the other. Because these things are subjective, because they are ineffable, they are hard to put in a rubric. And that is the nature of art.
And to that point, you know, you started this segment by talking about Sam Altman saying like, hey, you know,
we just basically can't write a great poem yet.
Sam Altman, a year ago, said the company had trained a good creative writing model and posted a short story on X. Many people found it compelling. Is Sam Altman just not being consistently candid with
us, Jasmine? Oh, it wouldn't be the first time. But that short story,
if you remember, and I'm sure you guys recall, had some great lines,
like talking about the seams of mirrors, or Thursday. What was it? It was, like, the liminal, almost-Friday. I'd have to actually look this one up, because it was so good. Well, you know, the thing about AI writing is, like, it
comes up with all of these fun metaphors, and they are, like, kind of surprising sometimes with the metaphors. But also, the language is not grounded in a life.
That was my other thing: aside from the verifiability,
fundamentally, when I think about the writers who I really love,
when I think about, whether it's journalists or poets or whatever,
like, they are writing from life, right? Like, a journalist goes out and talks to people. And they, like, see stuff and observe, like, the color of the sky in a particular way. Or, like, a poet is thinking about personal experiences that they've had.
Their writing has stakes. It comes from an emotional place. And the fact is that, like, LLMs, despite being very talented, grammatically pristine, whatever, they don't have lives.
That means that all of the metaphors they choose, all of the words they choose, the examples they choose. They're just ungrounded, right? Like it's not coming from a point of view or a particular experience or a particular community that makes the writing believable.
I think part of voice and style is that it is very specific to the life that a person has had. And LLMs cannot get there, in the same way that a human who hasn't really lived that life cannot get there.
I don't know. I feel like it's case dependent. You know, I'm a big music fan. Yeah. And over the past few months,
I have enjoyed putting questions about music, and in particular the sounds of certain bands, to an LLM. Which sounds like a joke prompt.
Because an LLM has never heard anything, right?
And yet I find that in general, the models can have good conversations with me about the sound of music. Now, it may be that they are just pattern matching based on a bunch of public writing on the internet by people who do have ears. And have heard, right?
Like I'm very open to that. But I have just sort of been struck about the way that it is able to write about sensory topics in an evocative way that, at least to me, like surpasses what I would predict they would be able to do. Yeah.
I want to pose a couple of objections that I think someone might make to your article.
One of them is: this is cope. This is Jasmine, a writer, a very talented writer, sort of finding the things that AI, in her view, is not good at yet and saying, this is categorical proof that it will be very hard for AI to do these things. This is the same reaction that software engineers had when models started getting really good at code.
They would say, oh, well, it can't do these other ten things that I do.
And that basically, just wait a few years and the models will be better than all of us at everything, including writing.
I would love for it to be cope, because I try to automate myself away all the time. I have no sort of deep attachment. Like, I like writing, but I have tried over and over and over for the past three years to automate my own job away, to get Claude to do my job for me. It cannot do it.
This is very frustrating. It's not for lack of trying. And again, I'm going back to the CEOs themselves and the things that they themselves are saying, right? It's not just me, a writer. It's Sam Altman saying this thing will cure cancer and solve physics,
but it will not write better than a real poet's okay poem.
And so I think that suggests that there is something that is at least perceived as a little bit different.
I think it's very possible the models will get much better at writing over the next few years.
I don't think it's like a never thing.
I do think that, you know, reporting is hard to replicate. I think that having life experiences that are real and verifiable is hard to replicate. I think the style stuff can be improved, especially if you fine-tune the models. But I think what's also interesting to me about this piece is that it shows how the market incentives, the demand incentives, of these companies do shape what their abilities are today. The other objection I imagine people might have who are very AI-pilled is that this is all in the eye of the beholder, right?
There have been several studies now that have shown that if you give people a blind taste test of AI writing versus human writing, they prefer the AI writing, until you tell them that it's AI writing, and then the value in their eyes plummets. I did one of these in a New York Times quiz just recently. So is it possible that the models have already become superhuman at writing, but that the minute we learn that it's AI models generating text and not humans writing words with their fingers, we lose all interest in it, just because of the source, not because of the quality of the writing?
I mean, I think it's definitely interesting and true that people don't want to like AI writing, and that is part of what bothers them when they see AI text that is obviously AI, even though, like you said, in these quizzes and tests AI can outperform human writers in those narrow scenarios. I mean, my quibble with a lot of these quizzes and tests is that, as a writer, and you guys are writers too, how much of your job is actually text generation?
I think AI is a superhuman text generator, right? My job, I am generating text probably 25% of the hours in my day. I spend a lot of time interviewing people. I spend a lot of time coming up with ideas. I spend a lot of time reading, and not just reading indiscriminately, but reading very particular sources that feel like the right ones. And so, like, you know, usually at the point that you are doing one of these tests,
you're saying like generate like one paragraph very specifically about like why Trump won the 2016 election,
500 words or less, and like you've already given the prompt which I think is ...
You've often supplied some of the evidence and the guidance and the form of it, saying, like, 500 words or less. And at that point, I do think that AI is probably a better text generator than almost all humans are.
But again, when I think about it, you know, AI is still very bad at coming up with ideas for articles.
It is still very bad at reporting. The non-text-generation parts of the role feel further away from automation.
Again, I'm not a never-sayer. I'm sort of, like, never say never, maybe you'll get there.
I would be totally happy if, you know, Claude was able to give me good ideas for my next essay, but it's not there yet. Well, we're already seeing the LLMs make huge progress in genre fiction, right? Like, recently on the show, we talked to the author of a story in the Times about how authors of romance novels are now able to generate dozens of novels a year using LLMs. In fact, much of the discussion that we had was around how you just have to prompt them differently, and sort of relentlessly, in order to get what you want.
You know, your piece, Jasmine, made me wonder how much of getting a model to just write weird can be achieved by repeatedly telling it in different ways,
Hey, be a little weirder.
Some of it, but not all of it. I mean, so I talked to, for example, James Yu, who is the co-founder of Sudowrite, which is one of the earliest creative-fiction AI writing assistants. I talked to some other folks who similarly were in the fiction-writing LLM space. And like you said, to an extent, a lot of writers are already leaning on LLMs to generate large amounts of text, and it can be very successful, and it can meet readers' needs and whatever. But even these people I was talking to were describing to me how freaking hard it is to undo all of the post-training that the labs have done.
So they are applying immense amounts of engineering effort that, in my conversations with them, clearly frustrates them, because it is so hard to get these models to stop being so treacly, so synthetic, so PG-13 and everything, in order to get them to this sort of base-model state where they're able to be weird again. So I think it's certainly possible, but I think the labs have made it quite challenging just because of the way that these models are trained.
The other thing that I think is important is, I tend to think that writing and a lot of creative work is actually, like, the perfect use case for these centaur models, right?
The idea that the human-plus-AI collaboration is where you can get the furthest. And when I listened to the interviews that you guys did with the fiction authors, I was thinking, this is a centaur model, right? Because without the human prompting and bullying the AI into getting weird and getting sensual and whatever, it was not going to do that on its own. And I myself do use LLMs as a research assistant. Like, I wrote about that in the Atlantic piece, about the way that Claude has now sort of helped me edit my own work in a way that I found incredibly useful.
But I do feel like the collaborative element is important for any domain where the personal perspective, lived experience, whatever, really matters. Talk about that a little bit. You mentioned your editing process. How are you using AI to help you edit your work, and are you finding it useful? Yeah, so I feel like I really cracked this over the last couple months, which I'm very excited about, because again, I've tried to make these things write and edit for me over and over and over, and they've never really been able to do it.
So the thing that I realized was, if I make Claude into an editor that is not just trying to grade and give feedback on my work against some generic AI standard of what good writing is,
but actually against what my personal, Jasmine's personal, aspirations for writing are, it can give feedback that I find much, much more helpful.
So what I did was, basically, I fed Claude my entire Substack archive of the writing that I've previously done, as well as some of my freelance work. And just to get real specific, is this inside, like, a Claude Project, or how have you set this up? Because I know our listeners are going to want to try this. Yes, I did it in a Project, on Claude's advice. I was like, do I need to Claude Code something? Claude was like, no, that's overkill. So you don't need to code anything. So in a Claude Project, I gave it my whole archive of writing.
I also personally write retro notes to myself after everything I publish. So I have a notes app that's just me writing what was good and bad about everything I've ever written, just a few bullet points. This is why Jasmine's going to be our boss.
I also gave it that, because I wanted it to learn my taste. I wanted it to learn: what do I aspire to be, where do I see myself falling short, and what am I proud of, right?
And so from those two things, plus a little bit more information about, like, here's my audience, this is my beat, these are my goals, we were able to co-develop a rubric. So instead of, like, how many exclamation marks does it have, it would say things like, does this take advantage of your, quote unquote, insider-anthropologist position in Silicon Valley? Because that's one of the things that Claude and I think distinguish my voice. Or it will also notice, like, oh, Jasmine, you tend to move between registers. You will switch between, you know, startup jargon and internet slang and whatever.
Like, the fact that you can do the high-low, or move from policy to a personal scene, this is something that is characteristic of your writing. And so again, we're co-developing these qualitative criteria, and then I split it into phases: an ideation-phase rubric, a structure rubric, a prose rubric, final fact-checking.
So what I do now: I put this in a Claude Project.
I dump a draft into Claude, and Claude will run, like, phase two, structure, on it. It'll say things like, your conclusion is just a summary and this is really boring.
In fact, in your piece about this and that you actually ended on a scene and I thought that was much more powerful.
So why don't you try ending this one on a scene? And rather than inventing a scene, Claude will say, what were you thinking when the plane took off? What were you feeling inside?
Can you think of a scenario where you had a conversation with, say, a kid-safety advocate about AI that really resonated with you? Because right now it sounds like a dry policy explainer. And that feedback I actually found incredibly useful. I'm still applying my own judgment to say, do I take it or not, but this is about me becoming the best version of myself as a writer. It's about me self-improving and Claude pushing me to do that, which I found much, much more helpful.
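Jasmine's setup lives in a Claude Project rather than in code, but the shape of her phased rubric can be sketched as plain data. This is a hypothetical illustration: the phase names come from the conversation, while the criteria strings and function names below are invented examples, not her actual rubric.

```python
# Hypothetical sketch of a phased editing rubric like the one described
# above. Phase names are from the conversation; the criteria are
# invented examples, not Jasmine's actual rubric.

RUBRIC = {
    "ideation": [
        "Does the idea use an insider-anthropologist position in Silicon Valley?",
    ],
    "structure": [
        "Does the conclusion end on a scene rather than a summary?",
    ],
    "prose": [
        "Does the prose move between registers (startup jargon to internet slang)?",
    ],
    "fact_check": [
        "Is every quote and claim traceable to a source?",
    ],
}

def prompt_for_phase(phase: str) -> str:
    """Build a feedback prompt asking the model to apply one rubric phase."""
    criteria = "\n".join(f"- {c}" for c in RUBRIC[phase])
    return f"Run the {phase} rubric on the draft below:\n{criteria}"
```

In a real setup, each phase's prompt would be run separately over a draft, so feedback stays scoped to one concern at a time instead of one undifferentiated critique.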
I want to ask you both a question as fellow writers. Do you feel the impulse to make your writing weirder because of AI, to sort of stand out from the sea of slop?
Because I find myself feeling this tug of like, oh, that's a little weird aside that probably I should cut, but I think I'm going to leave it in because like Claude would never do that.
It's like a marker that I am typing these words, and I feel like that's sort of the imprimatur that I'm leaving. My answer to you is yes, I absolutely feel that way, and I've gone back and tried to edit sentences to make them feel a little bit weirder, or in particular to make them sound colloquial in a way that I know an LLM generally would not be. It is for that reason, I think, that with writing right now, all, or many, of us are on such high alert for the prospect that we might be reading slop, that if you're a writer who does not want to be producing slop, you should be asking yourself that question.
I think it has made me a lot more comfortable writing the way I want to write in the first place. Like, maybe unlike both of you, I didn't come up through newsrooms where I was learning a very specific house style and all of these norms. I can do news writing now; it's something I've learned. But I'm actually much more, quote unquote, internet and blogging native, which is a form that is voicy and irreverent and not as polished, and will make inappropriate jokes, which is a looser form of writing. And so I think what it's actually done is make me more comfortable doing the blogging thing, instead of sort of always trying to write in a more professionalized journalistic tone.
So I think we should leave this with a question for you, Jasmine, which is, you know, your piece makes the case very convincingly that today's AIs are not very good at the kind of writing that I think we all value.
Do you think they will get there, and what should the companies do to make their models better at writing?
I think that if we separate out text generation from reporting, which I'm not that bullish on the models doing, and we're just talking about, say, literary fiction, or here's a bunch of interview transcripts, write a magazine feature or something. I think that if they applied as many resources toward that task as they do toward coding agents and the things that actually make the money, they could get there. Will the companies ever find it financially advisable to spend all their resources on that instead of automating 23-year-old software engineers? Probably not.
I would be grateful for that world. I don't need them to take my job or these folks' jobs, but I think it's possible.
Look, they're going to get around to it eventually, okay? You know, I mean, seeing what writers make in this economy, you can see,
essentially, that it was not going to pay for a lot of data centers. But know that there is economic value in writing, and eventually the AI companies will want that all to themselves. You know, it would be a very funny outcome of this, taking your point about the sort of guardrails of the models, if maybe the next great American novel will be written by Grok. Oh, God. And with that, Jasmine Sun, thank you for joining us.
Thank you very much. When we come back: everyone's spending money on tokens. Kevin, great, keep going. You've got to be token, token-maxing, that is. Keep going. Yes, and when we come back... what are you talking about? That's the question being asked by a leaderboard that's sweeping Silicon Valley. A sweeping... that's really sweeping. Sheesh. Well, Kevin, you've recently returned from book leave and are once again writing in the New York Times.
How does it feel to see your name in print again? Feels great. Hasn't happened yet, but when it does, it'll be great. Well, I got to take an early read of a story that you are publishing about the fact that tech companies have now created leaderboards to show which employees are using the most AI tokens in their work.
Yes, it's a token frenzy out there and the employees of these companies are c...
They want to be the people at their company who are using the most AI tokens.
So let me just ask a basic question for listeners who may not be familiar: what is a token, and why is that something you might start keeping track of?
So a token is the basic atomic unit of AI labor. It's basically a fragment of a word, and it is how AI model providers measure consumption.
So if you type in a prompt, you know, help me write this essay, an old model might have given you a couple hundred tokens in response; that would be a couple hundred words. And what has been happening over the past year or so, as these agentic coding tools have started taking off, is that the models are just much more token-hungry. You can now use hundreds of thousands or even millions of tokens in a single session. And so that is what is propelling these leaderboards: the idea that the more sort of coding you're doing, the more agentic tools you're using, the more simultaneous processes you're running,
the higher your token count will be. One measurement I found useful was that apparently it takes about ten thousand tokens to generate seventy-five hundred words, if that sort of, you know, helps to ground you at all. But as you just said, and I want to hear more about this, the more advanced systems are using way more tokens than that, so tell me about some of the numbers that some of the sort of token all-stars are putting up on the boards. So I don't know all of the exact numbers, but I did learn that at OpenAI, where they do track this kind of leaderboard, the highest employee token count over a seven-day period recently was a guy who used two hundred and ten billion tokens.
And this is, for rough scale, about thirty-three Wikipedias' worth of text. And now, all of that is not sort of typing and receiving a response; some of that is what they call cached tokens.
So it's not all sort of you know being extruded from the model for the first time.
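The rule of thumb cited above, ten thousand tokens per seventy-five hundred words, works out to roughly 0.75 words per token, and it makes the leaderboard numbers easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming that ratio (real tokenizers vary by model and language, and the function names are ours, not any provider's API):

```python
# Back-of-envelope token/word arithmetic, assuming the rough rule of
# thumb from the conversation: ~0.75 words per token, i.e. 10,000
# tokens ~= 7,500 words. Actual tokenizers vary by model and language.

WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> float:
    """Estimate how many words a given token count corresponds to."""
    return tokens * WORDS_PER_TOKEN

def words_to_tokens(words: int) -> float:
    """Estimate how many tokens a given word count consumes."""
    return words / WORDS_PER_TOKEN

# The leaderboard champion's 210 billion tokens over seven days:
champion_words = tokens_to_words(210_000_000_000)  # ~157.5 billion words
```

By this estimate, 210 billion tokens is on the order of 150 billion words of text, before accounting for the cached tokens mentioned above, which inflate the raw count.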
But these are the kinds of numbers that I think even a year ago would have sounded completely insane. Now, was this guy working on a new mass domestic surveillance program for the Department of Defense?
I don't know, and OpenAI did not make him available for interviews. But what I wanted to do in writing this column was to try to call up and talk to a bunch of people who are in this sort of billion-token club, right, these sort of extreme power users, and just ask them, hey, how are you guys using all those tokens, and isn't that very expensive, and how are you paying for it all? And I learned a lot. Yeah, well, okay, so tell us first of all just how expensive it is. Very expensive.
Yeah, in fact, I heard that the top user of Claude Code, the top individual user of Claude Code as measured by Anthropic, spent more than a hundred and fifty thousand dollars on tokens last month.
So extrapolate that: that is like an employee making more than a million dollars a year.
Yeah, and they are burning that in a month. And I heard similar figures from some of these other extreme coders, who are spending something on the order of thousands of dollars a day on tokens from these models. Now, we should also say, the employees of these companies get their tokens for free, right? So they're not shelling out; their companies are not shelling out. But at other companies, this is starting to become an issue, because they are sort of outstripping their budgets for these things. So there are companies where there are engineers who legitimately are costing their employers maybe a hundred and fifty thousand dollars a week, because they're getting tokens from one of the big providers.
I talked to a software engineer in Sweden who said that he probably spends more than his salary on Claude. So this is essentially becoming a very expensive job perk for some of these coders. So talk to me about why employers want to create leaderboards to promote this to employees, because I could see other companies saying, if you spent a hundred and fifty thousand dollars on tokens last month, you actually don't work at this company anymore.
So this was a big question that I had: why is this going on? And it seems to be some combination of sort of employee motivation and worker tracking, right?
There are executives at these companies who think that the more tokens you use, the more productive you probably are. And as we discussed in a previous segment on this show, these companies are very eager to have their workers start embracing the AI tools.
So at a number of these companies, I talked to people who said, yeah, this is...
I realize you probably haven't dug deep into their code, but what is your sense of how productive they actually are? Like, what is the relationship between token usage and taking my company to the next level? I mean, it's very unclear, right? Some of these people may be just generating worthless projects. The thing that worries a lot of the people I talked to about these leaderboards is that they just incentivize you to run up your token count, right? Because then you look like the special, you know, 10x engineer, 100x engineer, who's outperforming all your colleagues.
So I think there are a number of companies that see this leaderboard business as a little strange and maybe counterproductive, but I do think that there is a feeling among the most sort of heavy token users that they are being productive.
I have to say, when I read your column, I thought, this just seems like it would create the worst incentives, right? There's this idea of Goodhart's law, right? When a measure becomes a target, it ceases to be a good measure. I can't think of a better way to ensure that token usage becomes a bad measure than creating a leaderboard for it. Totally. What are the people inside the companies saying about that? Well, some of them are opposed to this whole leaderboard thing.
I also talked with some folks who defended the leaderboard. They said, look, it's never been all that easy to track the productivity of programmers.
Some people have had their productivity measured by how many lines of code they generate or how many pull requests they made. These are sort of imperfect proxies for how hard are you working, how much are you doing. Lots of these companies also see this, I think, wisely, as a key to their own success. A number of these companies are now using AI token consumption as part of the performance-review cycle. So you go in for your annual review, and your boss says, hey, it looks like you only used, you know, 70 million tokens last month, what's going on?
And so I think the engineers at these companies are getting wise to the fact that if they want to have a long, successful career, they'd better start using some tokens. Yeah, but I imagine that some of them are really nervous about that, though, right? Because it seems clear to me that at least some of these companies want to incentivize token usage because the companies themselves suspect that the more we can get them using this stuff, the less time we will have to employ the humans.
Maybe, although I think it's less about the AI systems replacing the humans, and more about it just being a radically different way of working, right?
These are people who most of them have had long careers in software engineering. They grew up writing code by hand. They maybe grew up using some sort of like AI assistant like GitHub co-pilot and what people at these companies are saying is that these agentic engineering systems are just really different. You have to approach them in a different way. You have to spend a lot of time with them to understand what they're good and not good at. And to them, this is sort of a way of motivating their employees to say, hey, go out and try the new thing.
Yeah, I don't know. I've been thinking a lot about this question of, like, if I were an engineer at one of these companies and I had this incentive to get on the leaderboard, how would I approach it? And I do think the instinct is to just waste a bunch of tokens to rise higher on the leaderboard. But ultimately, if you rise too high, people are going to ask you what you did with all the tokens.
If you're number one with, like, 10 billion tokens, and you only managed to, you know, vibe-code a calculator or something, people are probably going to get mad at you.
And I actually did talk to one person who speculated that the people at the top of the leaderboards are all doing side projects. They're starting their side hustles. They're starting new companies with the boss's money. And if you're doing that, I just want to say, I salute you. That is the right way to work. Yeah, maybe don't be number one on the leaderboard if you're doing that. Maybe try to stick around six or seven. Yeah, middle of the pack is kind of where you want to aim yourself.
I mean, let me ask: is there any kind of token tracking that you think offers a reasonable signal? Like, do you think that if you're a tech company, you should create a leaderboard?
No, I think that's a bad idea, for all the reasons that we just talked about, including Goodhart's law, which is, I think this is just going to lead to people wasting tokens doing side projects. But if I'm the budget manager at a company and I'm seeing that people are spending multiples of their salary on AI tokens,
I'm asking them some questions about what they're doing with all that. And if their answer is not, I built an amazing new product that's going to generate billions of dollars a year in revenue,
I'm going to say, hey, could you maybe use a little less this month? Yeah, I have to say, I have been struck at how this idea of the token leaderboard just represents a new incarnation of something that the software industry has been trying to figure out for a long time, which is: how can I figure out if my software engineers are productive?
You know, I was talking recently to this very handsome software engineer who ...
And he was telling me that he used to be evaluated on how many lines of code he contributed.
And he told me about all the games that people used to play back in the day, with, oh, you know, I wrote a quick algorithm to translate a bunch of stuff into some new languages, and it's completely worthless, but it makes me look like I had a very productive week. And so I went back and looked into this, and they were doing this in the '60s and '70s, and there's this saying from the early days of computer programming that eventually arises, which says, quote, measuring programming progress by lines of code is like measuring aircraft building progress by weight.
I have to say, I think the same thing kind of applies here, right? Like, yes, if you squint, and at the right level of abstraction, it's probably true that some people who are using a lot of tokens are more productive than some people who aren't.
It just doesn't quite seem like the right way to measure these things, and I just wonder how quickly the industry is going to figure that out.
Yeah, I think it's going to be pretty soon, in part because the budgets are just getting very ridiculous, and especially because the AI model providers are now seeing individual users consuming amounts of their services that entire companies would have consumed just a few months ago. You know, maybe the kind of last question I have for you about this is just what implications you think it has for the broader economy, right? Because we know that in so many different sectors of the economy, managers are saying, I want to incentivize my employees to use AI, and I want to track how they're using AI. So do you think that, as knowledge of these leaderboards spreads, we're going to see people in non-technical fields try to adopt their own versions of them?
I think it's really a bad move, not just for tracking actual productivity and output, but just for morale, right? Like, I remember years ago when Gawker would have a traffic leaderboard at their office, so you could see how many clicks your stories were getting relative to other people's, and I don't think anyone who worked there at the time thought that was incentivizing the right things or creating high morale among employees. Basically, everyone was just competing with each other all the time.
And I think in this case it's even worse, because it's not necessarily even correlated with any success. It's just pure, sort of, how many agents can you run in a parallel swarm to work 24/7 doing tasks of uncertain value? Which is a great question to ask on a first date in San Francisco, too, by the way. But anyways, I have to say, I worry that this idea of token-maxing is going to spread into the broader economy. I was talking with somebody who works in marketing this week, and she was telling me that her job used to be evaluated solely on creativity, and then recently the performance review got a new AI section, and everyone is being evaluated on how much AI they use. And from her perspective, she was kind of like, this was working fine, you know? I didn't need to use an AI tool to help me, but now, you know, my bonus might be based on how much of it I use.
So I think this thing has sort of already seeped out of the labs and is getting into the water elsewhere, and I just hope that managers are really thoughtful about what they are incentivizing, and that maybe AI use for the sake of AI use is not going to be the boon to your company that you're hoping it is.
Yeah, I think it's going to be very case by case. I think there will be people who are token-maxing who are way more productive than their colleagues and doing way more projects way more quickly.
I think there will be other people whose managers look at their token budgets and say, you spent this many tokens on what? And we'll have to have some hard conversations. But I think it's very hard to paint with a broad brush and say that all of this token-maxing is pointless productivity theater.
It sounds to me from my conversations like some of it really is working for people.
Yeah, well, I will say, on the flip side, I've also heard of people in my social circle who have gotten in trouble for spending too much on... Yeah. And when I heard that, I was like, your company is not going to make it, bro. You've got to spend on this stuff. Well, what's so interesting is, it's now becoming part of job conversations for engineering jobs. People are going into new jobs and saying, well, what's my token budget? And for the employees of these big AI labs, who have unlimited free access to the models,
some of them are using so many tokens that they effectively can't afford to quit their jobs, right? Because anywhere else they would work would have to pay for their tokens, and it would be completely unaffordable to employ them. Yeah, I mean, those sound like real incentives, and better than the ones at Meta. Do you remember when Meta was spinning up a superintelligence lab, and they said, you can sit really close to Mark Zuckerberg? If I were them, I'd be like, well, I'll take the tokens, thanks. Well, just to wrap this up, exactly how many tokens should a person use?
I think, I think that's... you have to look within yourself. Within yourself.
Yeah, okay. Yeah, that's between you and your God. Yeah, do what Marc Andreessen will not, and introspect.
[Music]
Hard Fork is produced by Rachel Cohn and Whitney Jones.
We're edited by Virin Pavich.
We're fact-checked by Caitlin Love.
Today's show was engineered by Katie McMurran.
Our executive producer is Jen Poyant. Original music by Rowan Niemisto,
Alyssa Moxley and Dan Powell.
Video production by Soner OK, Pat Gunther, Jake Nicol and Chris Schott.
You can watch this full episode on YouTube at youtube.com/hardfork. Special thanks to Paula Szuchman, Pui-Wing Tam and Dalia Haddad.
You can email us, as always, at hardfork@nytimes.com.
Send us your token budgets. [Music]


