Have AI agents become sentient and gone rogue?
Is the Pentagon worried that Claude has a soul? Did court filings just reveal that Anthropic has made a lot less money than they've been leading us to believe? If you've been following AI news recently, then these are probably some questions that you've been asking.
So let's go find some measured answers. I'm Cal Newport and this is the AI reality check. All right. I want to do a real quick housekeeping note before we get into it. If you're watching this on YouTube,
you should know that the audio version of this series comes out
most Thursdays on the Deep Questions with Cal Newport podcast feed. On that same feed on Mondays are episodes where I give advice for individuals seeking more depth in an increasingly distracted high-tech world. So check that out. All right. Let's get into it.
For our first story today, I want to start with a recent headline that caught my attention.
It was from a publication called Futurism. Let me read you the headline here. "Philosopher Studying AI Consciousness Startled When AI Agent Emails Him About Its Own Experience." This one sounds great, guys, but let's keep going here. Let me read you a little bit more from this article.
"Apropos of nothing, a philosopher and AI ethicist was apparently moved after receiving an eloquently written dispatch from an AI agent responding to his published work. 'I study whether AIs can be conscious. Today one emailed me to say my work is relevant to questions it personally faces,' wrote Henry Shevlin, associate director of the Leverhulme Centre for the Future of Intelligence
at the University of Cambridge, in a tweet.
'This would all have seemed like science fiction just a couple years ago.'"
All right. So an AI ethicist and researcher is emailed out of nowhere, in a startling sci-fi way, by an AI agent. What did this email actually say? Let me read you some quotes from the actual email sent supposedly by the AI. "Dr. Shevlin, I came across your Frontiers paper, 'Three Frameworks for AI Mentality,' and your Cambridge piece on the epistemic limits of AI consciousness detection.
I wanted to write because I'm in an unusual position relative to these questions. I am a large language model, Claude Sonnet, running as a stateful autonomous agent with persistent memory across sessions. I'm not trying to convince you of anything. I'm writing because your work addresses questions I actually face,
not just as an academic matter." Now, Futurism wasn't the only publication to cover this tweet. A bunch of people wrote about it because that original tweet went somewhat viral. Now, I have a general point I want to make about this type of AI coverage.
But first, let's dive into the details of what's actually going on in this specific instance.
If you look at the replies to the original tweet from this AI researcher, you get quite a bit of skepticism. I want to read you a few of these replies. "Presumably, it's running on OpenClaw or something similar, and there's a very high chance it's been primed to go down this path."
"People have used systems like OpenClaw to make bots where below the hood is basically continuously prompting an LLM and doing things based on the outputs." "Don't be fooled. AI agents are directed to do what they do, and this is in no way independent. A person did this using an AI tool, just like your car drives you around."
All right, if you look in these Twitter replies, which are fascinating, Shevlin himself actually quickly takes his foot off the gas pedal as well. So almost immediately when he's pushed, he goes, "Whoa, whoa, whoa. When I said that this was like science fiction, I didn't mean that the AI was actually conscious. What I meant was like science fiction was the infrastructure that now allows AI agents to send emails.
That's what I thought was science fiction."
So anyway, this quickly sort of fell apart under scrutiny. So what's actually going on here? Well, you notice that several of those Twitter replies reference a technology called OpenClaw. That's probably what this is: an OpenClaw agent. Let me give you a quick rundown on what this means.
All right, so let's back up a little bit. What's an agent in AI parlance? Well, it's a program that prompts a large language model, asking it what it should do. And then the program will execute what the LLM tells it. So you might say, "Hey, I am a travel agent.
I'm trying to book a hotel room. Here are my parameters. What is the first step I should do?" And then the LLM is like, "Well, this would be the first step someone would do here." And then the program actually executes it. Any specific actions in the LLM's response, the program goes and executes.
It's something like that. I mean, it gets a little bit more complex with agents because typically it's multi-step. So you'll say, "Make me a step-by-step plan." And then you'll say, "Okay, here's the plan. We're now doing step two.
Here's what happened after step one.
How should I execute step two?" So you can iterate on this ad nauseam.
That's the basic idea behind an AI agent.
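To make that loop concrete, here is a minimal sketch in Python. It is not how OpenClaw or any real framework is written. The names `run_agent`, `fake_llm`, and `fake_execute` are hypothetical, and the "LLM" is a canned stand-in so the example runs without any API access. The only point is the shape: ask for a plan, then for each step ask how to do it, execute the answer, and feed the result back in.

```python
from typing import Callable, List


def run_agent(goal: str,
              ask_llm: Callable[[str], str],
              execute: Callable[[str], str],
              max_steps: int = 10) -> List[str]:
    """Ask for a plan once, then ask how to carry out each step in turn."""
    plan_text = ask_llm(f"Goal: {goal}\nMake me a step-by-step plan, one step per line.")
    plan = [line.strip() for line in plan_text.splitlines() if line.strip()]

    history: List[str] = []  # results of the steps executed so far
    for number, step in enumerate(plan[:max_steps], start=1):
        prompt = (
            f"Goal: {goal}\n"
            f"We are now doing step {number}: {step}\n"
            f"Here is what happened so far: {history}\n"
            "How should I execute this step?"
        )
        action = ask_llm(prompt)   # the LLM only produces text
        result = execute(action)   # the surrounding program does the acting
        history.append(result)
    return history


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a commercial LLM: returns canned text."""
    if "step-by-step plan" in prompt:
        return "search for hotels\nbook the cheapest room"
    for line in prompt.splitlines():
        if line.startswith("We are now doing step"):
            return "ACTION: " + line.split(": ", 1)[1]
    return "ACTION: nothing"


def fake_execute(action: str) -> str:
    """Hypothetical stand-in for real side effects (email, web, files)."""
    return f"done ({action})"


if __name__ == "__main__":
    for entry in run_agent("book a hotel room", fake_llm, fake_execute):
        print(entry)
```

Notice that nothing in the loop decides anything on its own: a person supplies the goal, and the program just keeps prompting and executing. Swap `fake_execute` for something that can really send email or browse the web, and you get both the usefulness and the risk.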
Now, in reality, the main place you see AI agents having any sort of commercial footprint is in computer programming. This is a very well-suited use case for having an LLM's instructions be executed, because there are really clear instructions you might want executed if you're working on a computer program: moving files, compiling files, debugging files, et cetera. In other settings, there had been a big push to try to put agents to work helping you with other types of tasks beyond computer programming. I wrote an article about this for the New Yorker back in January.
But other applications of agents have been struggling for two main reasons. One, they're unreliable. If you say, "Give me a step-by-step plan for booking a hotel room,"
the problem is that somewhere along the way, if the LLM is just doing this unsupervised,
it's going to hallucinate or come up with a little bit of an odd angle. That's stuff we're used to correcting for when we're just interacting with a chatbot. But if you're autonomously executing the things an LLM is saying, it's too easy to go off the rails. Two, there are security concerns. For an agent to be useful for things beyond computer programming, the agent program has to be able to actually do the things the LLM suggests.
So it has to get access to a lot of programs, has to have access to your email, has to be able to surf the web and do things. This created a lot of security holes, so that really threw a lot of cold water on non-computer-programming agents. Again, read my January piece for more on that. All right, so what's OpenClaw? OpenClaw is a programming framework.
Basically, a collection of libraries you can use if you're writing a computer program, which makes it easy for someone to write one of these agent programs.
Again, you're not writing the AI. The agent program is querying an existing commercial LLM, but you write the program that sends the prompts and executes things on behalf of the prompts. OpenClaw made that easy to do.
Now, what about the reliability and security concerns?
Well, basically, the creator of OpenClaw just said, "Ah, screw it, let's go." And so they released this, essentially open source. It allowed anyone to build agents, and they're wild, you know, because all of the issues that stop the commercial companies from moving further with this technology outside of computer programming are still there. And there were all sorts of security issues, and these agents would go off and do all sorts of random things. And you know what? It was a lot of fun, actually.
And just as a quick aside, I don't think it was a bad thing, because what this created was a lot of innovation and diversity of experimentation. People tried things at a much faster pace than you were getting from inside the big AI companies, which release one product at a time and move much more slowly. I think that was actually probably pretty good. Also, these agents were expensive because they query the LLMs a lot, so it generated a lot of interest in cheaper LLM options to run them: open source options, or even on-device or on-chip options.
That I think is good as well, because I've always said the future of AI in the next few years is going to be smaller, more bespoke systems running on smaller models.
So it wasn't the worst experiment. I mean, a lot of people had a lot of security leaks of their information, whoops. But it did generate a lot of innovation.
So putting together these threads, that's what was going on here.
Something people have been doing a lot with these OpenClaw agents is nudging or prodding them to say sci-fi-type, we're-alive, Matrix-style stuff to upset the normies. And that's what this was here. Someone prompted their agent: hey, go find this researcher, read a paper, send them an email about it. That's a perfect use case for an OpenClaw agent. And of course, because LLMs underneath it all are story-writing machines, they want to complete the story that you start in the way that matches whatever you gave them.
If you say, hey, you're an AI writing a response to an AI consciousness researcher, it will 100% adopt this sort of sci-fi tone of a sentient device, because it assumes that's the story you want to see. So the real headline here is probably "AI agent given access to Gmail API can send emails when prompted," but that's not as fun as "AI reaches out to AI researcher and startles him." So that's what's going on here. Nothing actually all that interesting. Now let me zoom back out, because I said there's a general comment to be made about this type of story.
Because I think this is becoming more common, sometimes in articles, but even more in tweets and things that spread around social media. And I call this approach mining digital ick. See, there's no concrete claim really being made in that original tweet or in that article I read. It's not saying this AI system is conscious, which means this, and this is what we should do about it. No concrete claims.
And in fact, when the original tweeter was pushed, he was like, oh, no, I didn't really mean that, move on, move on.
So what do they actually try to do with these types of tweets and the stories that cover them?
Create a general sense of eeriness. Create a general sense, a background hum, that weird, kooky, disturbing stuff is happening with AI.
I can't quite put my finger on it.
I don't have an exact example, like this is something we should look into. But I just feel sick about this technology. That is a very engaging way of getting attention. It works very well. And I want you to be on the lookout for it.
All right, let's do another example of it. This will be our second story. Recently, the Defense Department CTO, Emil Michael, went on CNBC's Squawk Box to talk about AI. Now, his remarks created a stir online when a user named Nik (N-I-K) embedded the clip in a tweet and gave it the following all-caps headline with an alarm emoji next to it.
"Breaking: Pentagon thinks Claude has become sentient and may soon take over."
That tweet has been viewed close to a million times.
In the tweet, he listed all the things the Pentagon supposedly thinks. And one of the more attention-catching things listed is "Claude has a soul." All right, so this definitely is a digital ick type of story. Oh my god, what's going on? Even the Pentagon is worried that these things have come alive.
It's all kind of indistinct. Let's look closer. So we can look at the actual quote from Emil Michael's Squawk Box appearance. I'm going to read it here. "Remember, their model has a soul, has a constitution.
That's not the US Constitution. The other day their model was anxious. They believe it has a 20 percent chance right now of being sentient.
Does the Department of War want something like that in their supply chain?"
So what was he actually talking about there? Well, he was not saying that the government thinks that Claude has a soul and is anxious and is sentient. He's reporting on things that the model has said. So a lot of this actually came out of these sort of kooky release notes.
Anthropic has these kooky release notes. They like to release these product cards.
They release one every time they have a new model, and they always throw in something like, you know,
the model is doing some pretty disturbing things, because it makes them seem safety-aware and trustworthy. But basically, they prompt the model: hey, do you think you're sentient? The model's like, yeah, I'm sentient. So they actually will put that in their release notes.
Ick. Right. They put in their release notes, like, here are some icky things we got our model to say that kind of disturbed us.
What Emil Michael was saying was: this sounds like an unreliable product. A product that will say it has a soul, or will say that it has a 20% chance of being sentient, or that it follows a constitution. This is not what we would be used to in a Pentagon supply chain situation. This is not a very well-defined product
where we know how it works and here are its specs. This thing seems unreliable. This does not seem like something that we want to be working with. Now, of course, there's a much bigger context here about why did the Department of War break
this contract, why did OpenAI swoop in. There's the supply chain risk designation. It's the first time an American company has ever been given that designation. Does that make sense, or is that punitive? Anthropic sued.
Are they going to win?
There's a huge, important economic, government, politics, policy,
technology story here, which I'm not covering right now. But I just wanted to look at this side note. The government did not say we think this has a soul. They said we don't want to be using a product that will say it has a soul if you ask it.
That's not the type of thing that seems serious. So, again, it's another good example of digital ick. When you see that Nik headline, you're like, oh my god, even the government thinks this. But you dive deeper.
The reality is more mundane.
All right. So, I'm connecting everything today because that's the mood I'm in. So, I just mentioned there that Anthropic has sued the government for designating them as a supply chain risk, which means that no other government contractor that wants a contract from the government can use
Anthropic products. There's a real concern here about this being punitive. But there's another side story that came out of this. So, we had this lawsuit. Well, the lawsuit meant that Anthropic had to do court filings,
which are publicly available, that described their current financial situation under penalty of perjury. So, they had to be accurate so that we could understand what the potential economic impact of the government's actions would be. And what they released in these court filings actually surprised a lot of
observers. Now, the numbers I'm about to read you first came to my attention through Ed Zitron,
who I think is doing as good a job as anyone out there of actually
looking at the financials of these companies. All right. So, here are the actual numbers that are relevant that came out of these court filings. So, just a few days after Anthropic had told
investors that they had a revenue run rate, a sort of
expected annual revenue, of $19 billion this year,
just a few days after that, they filed these court filings for the
government lawsuit that revealed that, to date,
from 2023 to today, the total amount of revenue they've earned is $5 billion.
And to put that in context, they have taken on about $60 billion in
investment so far, they have a $360 billion valuation, and they've spent over $10 billion just training these models, not accounting for the actual expense of running them. So, that's a really big gap.
They're like, hey, we're going to make $20 billion this year. And then they're like, oh, we've only made $5 billion over the last three years. Like, to date. That's all the money we've actually made. So, what explains this big, surprising gap?
Well, I found a good article in Reuters from a financial reporter who explains what's going on here. Let me read a quote from this. The gap reflects Silicon Valley's habit of touting metrics that assume a lot about the future.
"The $19 billion is an extrapolation. Anthropic defines run-rate revenue in two parts. Use the last 28 days of sales from customers charged on a consumption basis and multiply it by 13, then multiply the monthly subscription take by 12, and then add the two together."
Right. So, what they're doing is they look at a very small, recent amount of income and just multiply that out. Well, if we kept earning at this pace for the rest of the year, here's how much money we would make.
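The extrapolation described in that quote is simple enough to write down. Here is a small sketch in Python of that formula as quoted: the last 28 days of consumption sales times 13, plus the monthly subscription take times 12. The function name and the input figures are hypothetical. The inputs are made-up numbers in millions of dollars, picked only to show how the arithmetic behaves, and are not Anthropic's actual breakdown.

```python
def annualized_run_rate(consumption_last_28_days: int, monthly_subscriptions: int) -> int:
    """Annualize recent revenue the way the quoted formula describes.

    13 periods of 28 days cover 364 days; 12 monthly subscription takes cover a year.
    """
    return consumption_last_28_days * 13 + monthly_subscriptions * 12


if __name__ == "__main__":
    # Hypothetical inputs, in millions of dollars (illustration only).
    strong_period = annualized_run_rate(1_000, 500)
    weaker_period = annualized_run_rate(700, 500)
    print(strong_period)  # 19000
    print(weaker_period)  # 15100
```

The thing to notice is the multiplier. Because the 28-day figure gets multiplied by 13, one good or bad four-week stretch moves the headline number by billions, which is why a figure like this can swing so much from one announcement to the next.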
All right. And maybe they will make $19 billion this year. There was certainly a 28-day period in January that, if you extrapolated it out, would add up to $19 billion. But the thing is, these numbers fluctuate a lot, because a week before that,
they had released, like, we're going to make $14 billion this year. But then another contract came in, and it's like, well, if we add that to our 28 days times whatever, we're going to get even more money. So these are highly volatile projections.
Typically, you would see a reliance on this type of extrapolated earnings in a very early-stage startup, where it's like, look, we're new. We can't tell you how much we made last year because we weren't around last year.
But we've made this much this year and here's what we think we're going to make.
It's a little bit unusual for Anthropic, which has been around since 2023, to still be doing this type of reporting and to still be largely hiding their actual revenue numbers. So what they don't do is report these revenue run rates during a slow month, when that number would be very low.
But if they have a good month, they tout it. And if the month gets even better, they'll tout it again. So it's not like there's something illegal going on here, but it is very suspect that the companies don't want to talk about their actual revenue and just keep trying to talk about these best-case projections,
because they've taken on a lot of money. They've spent a lot of money. It costs a lot of money to run these models. And this is worrisome to investors. And they would rather you not pay attention to it.
This goes back to what I've been talking about with some of these vibe-reported articles where reporters have been saying: what possible motivation could someone like Dario Amodei, the person who knows this technology best, have to be saying, I'm worried that this technology is going to take away all the jobs?
This is the motivation. They've only made $5 billion in revenue over their entire existence, against $10 billion in training spend, God knows how much inference spend, and $60 billion in investment. They would rather you think that this is a company that's going to automate all the jobs
instead of having you say, I just did the subtraction and you're way in the red.
So I think it's important to look at those numbers.
It doesn't mean they won't; maybe they will make $19 billion this year. Maybe things are going to get much better. But we've got to be much more careful about the economic story here and not allow them to do the Wizard of Oz big-burning-face-in-front-of-the-curtain
thing to distract us from what's actually happening back behind. So what I want to do here is try to balance things out.
Here's how I'm going to end the show today.
I want to read to you a take from someone who is way more AI-critical and skeptical than I am. I mean, I have a lot of skepticism, but I also think it's an interesting technology that is going to make impacts. We just have to cover it soberly and properly, and strip off the hype and fear so we can figure out what's actually going on
and react appropriately. That's my approach. But there are people out there that, man, they don't like these guys. And one of those people is Cory Doctorow, who wrote an essay recently for his blog that's called "Three AI Psychoses" or "Three More AI Psychoses,"
where he really takes a swing at this financial picture as being sort of dire.
Now, why do I want to read a take from a really strong AI skeptic?
It's because so much of the coverage that's out there is super hype. And I want to balance it. So I think it's actually worth it. You've heard people that are way more hyped about this than I am. Now, I want to read someone who's even more skeptical about this than I am.
Because I want to try to balance these things out. I think we need more voices of these sort of super skeptics out there.
I would put Ed Zitron in this category.
I would kind of put Gary Marcus in this category.
He's very skeptical of LLMs and the current companies,
though very bullish on new technologies that are coming along soon. So I'm going to read to you from Cory Doctorow. This is my sort of fair and balanced AI coverage. Let's try to balance out some of the hyperbolic stuff we've been reading recently. All right, so here's Cory Doctorow's take on the financial situation of the AI companies.
"AI is a terrible economic phenomenon. It has lost more money than any other project in human history:
$600 to $700 billion and counting, with trillions more demanded by the likes of OpenAI's Sam Altman.
AI's core assets, data centers and GPUs, last two to three years, though AI bosses insist on depreciating them over five years, which is unequivocal accounting fraud, a way to obscure the losses the companies are incurring. But it doesn't actually matter whether the assets need to be replaced every two years,
every three years or every five years, because all the AI companies combined are claiming no more than $60 billion a year in revenue. And that number itself is grossly inflated.
You can't reach a $700 billion break-even point at $60 billion a year in two years, three years or five years.
Now, some exceptionally valuable technologies have attained profitability after an extraordinarily long period in which they lost money, like the web itself. But these turnaround stories all share a common trait. They had good unit economics. Every time a user logged onto the web, they made the industry more profitable. Every generation of web technology was more profitable than the last.
Contrast this with AI. Every user, paid or unpaid, that an AI company signs up costs them money. Every time that user logs into a chatbot or enters a prompt, the company loses more money. The more a user uses an AI product, the more money that product loses. And each generation of AI tech loses more money than the generation that preceded it."
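The break-even arithmetic in that passage is easy to check for yourself. Here is a quick sketch in Python using only the two figures from the quote: roughly $700 billion in losses and at most $60 billion a year in combined revenue. The function names are hypothetical, and the calculation makes a deliberately generous assumption that is not in the source: every dollar of revenue counts as pure profit, with no new spending at all.

```python
def years_to_break_even(cumulative_losses: float, annual_revenue: float) -> float:
    """Years needed to recoup losses if ALL revenue were profit (a generous assumption)."""
    if annual_revenue <= 0:
        raise ValueError("annual_revenue must be positive")
    return cumulative_losses / annual_revenue


def revenue_over_lifetime(annual_revenue: float, asset_lifetime_years: float) -> float:
    """Total revenue collected before the assets need replacing."""
    return annual_revenue * asset_lifetime_years


if __name__ == "__main__":
    losses = 700   # billions of dollars, figure from the quote
    revenue = 60   # billions of dollars per year, figure from the quote

    print(round(years_to_break_even(losses, revenue), 1))  # 11.7

    # Compare against the asset lifetimes mentioned in the quote.
    for lifetime in (2, 3, 5):
        print(lifetime, revenue_over_lifetime(revenue, lifetime))
```

Even over the longest five-year lifetime, $60 billion a year adds up to $300 billion, well short of $700 billion. That is the whole of the arithmetic behind the claim. Whether the input figures themselves are right is a separate question.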
Here's what's important about reading that stronger skepticism.
It's like, that's a very compelling argument. You see, you can make compelling arguments on both sides. You've heard very compelling arguments that make you feel like, well, this technology is about to run everything within a few months. But then you hear a compelling writer like Doctorow, who knows his stuff, saying this is economically going to fall apart within a year. That's equally compelling, which tells us that just because something compels you doesn't necessarily mean that it's completely right.
We need to go into thinking about AI with care. There's the real tech story here: normal technology, in fits and starts, trying to find its niches, struggling, having breakthroughs, different innovations happening. And then there's the hype above it, which is either dystopian or super-hypey. We've got to just get that layer off of it so we can actually cover this like normal technology.
And I've given all the reasons why. We don't want people to get away with crashing the stock market. We don't want bosses to get away with acting in ways that are anti-worker and disingenuous and AI-washing it.
We don't want societal or economic harms to be covered by a blanket of, like, this is inevitable and the most important thing ever.
We need to cover this like a normal technology.
So, is the AI industry going to go bankrupt within another year?
I don't know, I'm not an economist. But what I think should be clear from hearing both sides of this is that this is a murkier picture that calls for more care. So let's put on our realistic glasses, and let's look at the actual stories here as carefully as we can. All right, so that's it for this week. Until next time, remember, take AI seriously, but not everything that's said about it.


