Today on the AI Daily Brief, a big dust-up around Anthropic's new product. How much of it is about price and cost, versus some larger existential ennui? Before that, in the headlines, the OpenClaw-ification of the world continues.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, quick announcements before we dive in. First of all, big thank you to today's sponsors: Recall.ai, AIUC, Blitzy, and Robots and Pencils. For an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn about sponsoring the show, send us a note at [email protected]. Aidailybrief.ai is also where you will be able to find all of the things happening in this ecosystem. But with that out of the way, let's talk Nvidia and the OpenClaw revolution. We have been tracking closely the OpenClaw-ification of the world, and last week, no less than
Jensen Huang had some very positive words for the project, calling it maybe the most important release of software ever. It felt perhaps like a bit of hyperbole, but with a new report from Wired, it makes a little bit more sense. Wired reports that Nvidia is planning to launch its own AI agent platform that is not dissimilar to OpenClaw. They write: "Nvidia is planning to launch an open source platform for AI agents. The chipmaker has been pitching the product, referred to as NeMo Claw, to enterprise software companies. The platform will allow these companies to dispatch AI agents to perform tasks for their own workforces. Companies will be able to access the platform regardless of whether their products run on Nvidia chips." Now, the timeline for this seems to be around Nvidia's annual developer conference, which happens next week.
Nvidia apparently has been reaching out to premier partners like Salesforce, Cisco, Google, Adobe, and CrowdStrike for partnerships around the platform. As Wired writes, "For Nvidia, NeMo Claw appears to be part of an effort to court enterprise software companies by offering additional layers of security for AI agents. It's also another step in the company's embrace of open source AI models, part of a broader strategy to maintain its dominance in AI infrastructure at a time when leading AI labs are building their own custom chips. Nvidia's software strategy until now has been heavily reliant on its CUDA platform, a famously proprietary system that locks developers into building software for Nvidia's GPUs and has created a crucial moat for the company."
What's interesting is that last year there was a lot of discourse about this idea of Nvidia moving up the stack and diversifying away from pure chips, sort of as a hedge against how the world might change, and positioning themselves for potential outcomes where people are less reliant on Nvidia chips, whether because they've got their own custom silicon or because the nature of the field has changed. I feel like there's less of a sense of that being a likely outcome right now than there was last year.
You've basically seen a lot of the big players, like Meta, seemingly back off of their custom silicon projects. Not, I don't think, because they're not interested anymore, but because the simple reality is that right now they just need the compute at basically any cost, and don't have time to wait around to figure out their own systems. Now, I don't think that Nvidia, as smart as they are, is going to stop hedging against future changes, but it will be interesting to see if and where any of these various experiments that they have outside of chips themselves start to actually become a more significant business line for the company in the future. Next up, Microsoft gets in the Cowork game. On Monday, Microsoft CEO Satya Nadella tweeted, announcing Copilot Cowork, a new way to complete tasks and get work done in M365. When you hand off a task to Cowork, it turns your request into a plan and executes it
across your apps and files, grounded in your work data and operating within M365 security and governance boundaries. Axios sums up the move this way: Microsoft launched Copilot Cowork on Monday, an enterprise AI agent built on Anthropic technology and named after the Anthropic product that wiped hundreds of billions off of Microsoft's market cap. In other words: if you can't beat them, join them. And indeed, this is not just a copycat version of Cowork; this is actually a collaboration with Anthropic. "Working closely with Anthropic," they write, "we have brought the technology that powers Claude Cowork into Microsoft 365 Copilot."
Microsoft is also increasingly pushing the idea of being able to select between different models. In that same blog post, they write: "Your work is not limited by one brand of models. Copilot hosts the best innovation from across the industry and chooses the right model for the job, regardless of who built it."
Now, there is of course a lot of memeing going around about Microsoft being behind or just copying others, but in this case, I think their speed of response is actually pretty good. There are lots and lots of people who, by virtue of their work environments, are stuck in the Copilot ecosystem, and for there to be less than a two-month gap between when Anthropic drops Cowork and when Microsoft offers their version in Copilot, that's a lot better than Copilot users have come to expect in the past.
I also think there's a certain humility and intelligence in not trying to do a janky version of it, but just partnering with Anthropic to actually get the thing close to the same level of capability that the Claude version has. Shawn "swyx" Wang writes, "Wait, did Microsoft really clone Claude Cowork? That's kind of based."
Still, Ethan Mollick brings up the big question that will probably dictate whether this thing is seen as successful. Mollick writes, "Microsoft seems to be launching its own branded version of Cowork. A big question is whether it will continue to use lower-end models without telling you, and whether it will keep pace as the space evolves or is a one-off."
To me, the question of whether Microsoft will give access to the most recent and best models is big. Given that GPT-5 beat or tied human experts on expert tasks less than 38% of the time, while months later GPT-5.4 beat or tied human experts 82% of the time, this really matters. Another big question, he writes: "Is this limited to producing materials that use Microsoft apps? How does it handle the fact that so much of what makes Claude Cowork interesting is that it can improvise all sorts of output using code?"
Giving a little bit of trajectory context to this, Brett Winton from ARK shared the revenue projections from Anthropic and OpenAI as compared to Windows and Office's top-line revenue, showing that if indeed Anthropic and OpenAI are correct, they will exceed Windows and Office revenue sometime in 2028. As Brett wrote, "What Microsoft built in around 40 years, they will have surpassed in around 5."
Next up today, we finally get some news about former Meta AI chief Yann LeCun's new startup. AMI Labs has raised $1 billion in what is Europe's largest seed round ever. The company is officially called Advanced Machine Intelligence Labs, and raised from Temasek, Bezos Expeditions, and Nvidia. Setting some expectations, new CEO Alex LeBrun says, "Anything that involves understanding the real world, we think large language models, and generative AI in general, are not the right solution. We have at least a year of research before deploying our first real-world applications, but this is not an applied AI company." Leaning in on what the company is doing, LeBrun also told TechCrunch, "My prediction is that world models will be the next buzzword. In six months, every company will call itself a world model to raise funding."
Certainly between Fei-Fei Li's World Labs and Google's Genie, I think we're likely to see a lot more from world models in 2026. Lastly today, more consolidation in the AI space: OpenAI is acquiring AI security platform Promptfoo. Now, what's interesting about this is less the deal itself and more what it implies for OpenAI and their seriousness in going after the enterprise.
They write that once the acquisition is finalized, they will integrate Promptfoo technology directly into OpenAI Frontier, which is of course their platform, as they put it, for building and operating AI coworkers: basically their enterprise platform.
They write, "As enterprises deploy AI coworkers into real workflows, evaluation, security, and compliance become foundational requirements. Enterprises need systematic ways to test agent behavior, detect risks before deployment, and maintain clear records to support oversight, governance, and accountability over time." One of the things that I expect to see this year is a ton of consolidation around building the true enterprise AI stack inside the big labs. Keep an eye out for more acquisitions in that theme, but for now that is going to do it for today's headlines. Next up, the main episode.
Why is there always a meeting bot in your Zoom call? Blame Recall.ai. Recall.ai powers the meeting bots and desktop recording apps behind products like Cluely, HubSpot, and ClickUp. They handle the hard infrastructure work, capturing clean recordings, transcripts, and metadata across Zoom, Google Meet, Microsoft Teams, in-person meetings, and more, so developers don't have to build it themselves. If you're building a meeting note-taker or anything involving conversational data, Recall.ai is the API for meeting recording. Get started today with $100 in free credits at recall.ai/aidb. That's recall.ai/aidb. Here's a new standard that I think is going to matter a lot for the enterprise AI agent space.
It's called AIUC-1, and it bills itself as the world's first AI agent standard.
It's designed to cover all the core enterprise risks, things like data and privacy, security, safety, reliability, accountability, and societal impact, all verified by a trusted third party. One of the reasons it's on my radar is that ElevenLabs, who you've heard me talk about before and who is just an absolute juggernaut right now, just became the first voice agent to be certified against AIUC-1, and is launching a first-of-its-kind insurable AI agent. What that means in practice is real-time guardrails that block unsafe responses and protect against manipulation, plus a full safety stack. This is the kind of thing that unlocks enterprise adoption. When a company building on ElevenLabs can point to a third-party certification and say our agents are secure, safe, and verified, that changes the conversation. Go to aiuc.com to learn about the world's first standard for AI agents. That's aiuc.com. You've tried in-IDE copilots: they're fast, but they only see local silos of your code. Leverage these tools across a large enterprise codebase and they quickly become less effective.
The fundamental constraint? Context. Blitzy solves this with infinite code context, understanding your codebase down to line-level dependencies across millions of lines of code. While copilots help developers write code faster, Blitzy orchestrates thousands of agents that reason across your full codebase. Allow Blitzy to do the heavy lifting, delivering over 80% of every sprint autonomously with rigorously validated code. Blitzy provides a granular list of the remaining work for humans to complete with their copilots. Tackle feature additions, large-scale refactors, legacy modernization, and greenfield initiatives, all 5x faster. See the Blitzy difference at Blitzy.com. That's B-L-I-T-Z-Y dot com. Today's episode is brought to you by Robots and Pencils, a company that is growing fast.
Their work as a high-growth AWS and Databricks partner means that they're loaded with elite talent ready to create real impact at velocity. Their teams are made up of AI-native engineers, strategists, and designers who love solving hard problems and pushing how AI shows up in real products. They move quickly using RoboWorks, their agentic acceleration platform, so teams can deliver meaningful outcomes in weeks, not months. They don't build big teams; they build high-impact ones. The people there are wicked smart, with patents, published research, and work that's helped shape entire categories. They work in velocity pods and studios that stay focused and move with intent. If you're ready for career-defining work with peers who challenge you and have your back, Robots and Pencils is the place.
Explore open roles at robotsandpencils.com/careers. That's robotsandpencils.com/careers. Welcome back to the AI Daily Brief. In 2026, the one thing that's clear to everyone is that things are moving very fast. Even for an industry where it already felt like things were going quickly, we've ratcheted up another notch.
As part of that, everyone is grappling with a series of different issues, from the very positive (how do I take advantage of all these new superpowers that I've been given?) to the exciting-but-challenging questions, like how do we redesign our organization around these new capabilities?
To the much more existential questions of what it means that the work that I've always done is no longer the work that I will be doing. In many ways, it feels to me like all of those debates came home to roost around a single product this week, which is Anthropic's new code review feature. Now, this is not a particularly complicated product to explain. As Anthropic writes: "When a PR opens, Claude dispatches a team of agents to hunt for bugs."
Code review is a key part of the development lifecycle, so it stands to reason that AI would be trying to add new efficiency to it. And certainly Anthropic is not the only company thinking in these directions. Cognition recently released Devin Review, which they call a "reimagined interface for understanding complex PRs." In their announcement tweet they wrote, "Code review tools today don't actually make it easier to read code. Devin Review builds your comprehension and helps you stop slop." Now, they go through a whole bunch of ways in which the product is different, and it got a pretty good response. A thousand people bookmarked that tweet, and three quarters of a million people viewed it.
That is of course nothing compared to the nearly 14 million who viewed the Claude post, which speaks not only to the relative size of Anthropic, but to the controversy surrounding this new product. So what actually was controversial? On the surface of it, this seems like it would be highly value-added.
While they are biased and incentivized to say so, certainly it seems like all the folks inside Anthropic who are using it have had really positive experiences with it. Alex Albert, who does Claude DevRel, says, "This has been a game changer for our internal eng and research teams. Rare to see a product get this much praise from some of the top engineers I know." Boris Cherny, the creator of Claude Code, points out, "We built it for ourselves first. Code output per Anthropic engineer is up 200% this year, and reviews were the bottleneck. Personally, I've been using it for a few weeks, and I've found it catches many real bugs that I would not have noticed otherwise." Jarred Sumner, who has been using this in Bun's repo (Bun being the JavaScript runtime whose company recently joined Anthropic), continues, "This is, in my opinion, the best product in the code review category today. It regularly catches extremely subtle bugs and rarely makes mistakes." Another Claude Code user agrees: "Code review is so, so good. One of those things I can't remember how I lived without."
What's more, the discussion of code review, and the inevitable changes to it, is something that the larger agentic engineering community has been talking about recently, irrespective of this Claude product. swyx of Latent Space wrote, "This is the final boss of agentic engineering: killing the code review. At this point, multiple people are already weighing how to remove the human code review bottleneck from agents becoming fully productive. I'm not personally there yet, but I tend to be 3-6 months behind these people, and yeah, it's definitely coming." Now, he points to a guest essay shared on Latent Space by entrepreneur Ankit Jain, called "How to Kill the Code Review." The sub-header, which encapsulates the thesis pretty clearly, is: human-written code died in 2025; code reviews will die in 2026. I won't read the whole thing, but here are a couple of excerpts. "Humans already couldn't keep up with code reviews when humans wrote code at human speed."
"Every engineering org I've talked to has the same dirty secret: PRs sitting for days, rubber-stamp approvals, and reviewers skimming 500-line diffs because they have their own work to do. We tell ourselves it is a quality gate, but teams have shipped without line-by-line review for decades. The code review wasn't even ubiquitous until around 2012 to 2014. One veteran engineer told me, 'There just aren't enough of us around to remember.' And even with reviews, things break. We have learned to build systems that handle failure because we accept that review alone wasn't enough. This shows in feature flags, staged rollouts, and instant rollbacks." The next section, and the core thrust of Ankit's argument, is called "We have to give up on reading all the code." He continues: "Teams with high AI adoption complete 21% more tasks and merge 98% more pull requests, but PR review time increases 91%, based on data from over 10,000 developers across 1,255 teams. Two things are scaling exponentially: the number of changes and the size of changes."
"We cannot consume this much code. On top of that, developers keep saying that reviewing AI-generated code requires more effort than reviewing code written by their colleagues. Teams produce more code and then spend more time reviewing it. There is no way we win this fight with manual code reviews. The code review is a historical approval gate that no longer matches the shape of the work."
Now, Boris Tane wrote something about this as well. His more broadly themed piece from February of this year was called "The Software Development Lifecycle Is Dead." Boris writes, "AI agents didn't make the SDLC faster. They killed it. I keep hearing people talk about AI as a 10x developer tool. That framing is wrong. It assumes the workflow stays the same and the speed goes up. That's not what's happening. The entire lifecycle, the one we've built careers around, the one that spawned a multi-billion-dollar
tooling industry, is collapsing in on itself. And most people haven't noticed yet." Boris argues that the software development lifecycle as we learned it is a relic. He writes, "Here's the classic software development lifecycle most of us were taught." Now, apologies for those of you who are just listening, but basically it's a circular chart that goes from requirements to system design to implementation to testing to code review to deployment to monitoring, and then back to requirements and through the system again. Boris writes, "Every stage has its own tools, its own rituals, its own cottage industry. Jira for requirements, Figma for design, VS Code for implementation, Jest for testing, GitHub for code review, AWS for deployment, Datadog for monitoring. Each step is discrete and sequential, with hand-offs everywhere."
Now, here's what actually happens when an engineer works with a coding agent. In this chart, there is one starting point, which is intent. That moves to the agent, and then the agent works in a circular fashion through code plus test plus deployment to the question of "does it work?" If the answer is no, it's back to the agent for more code, tests, and deployment, and back to the question of "does it work?" And as soon as the answer is yes, the code gets shipped.
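For those following along in text, that loop is simple enough to sketch in code. This is a minimal illustration, not any product's actual implementation; `agent_step` and `passes_checks` are hypothetical stand-ins for the agent's work and its verification step.

```python
def run_agent_loop(intent, agent_step, passes_checks, max_iters=10):
    """Iterate an agent toward an intent: generate, check 'does it work?', repeat."""
    context = {"intent": intent, "attempts": []}
    for _ in range(max_iters):
        change = agent_step(context)   # code + test + deploy collapsed into one move
        context["attempts"].append(change)
        if passes_checks(change):      # the "does it work?" gate
            return change              # yes -> ship it
    raise RuntimeError("intent not satisfied within iteration budget")

# Toy demo: an "agent" that only produces a passing change on its third attempt.
attempts = {"n": 0}
def toy_agent(ctx):
    attempts["n"] += 1
    return {"version": attempts["n"]}

shipped = run_agent_loop("fix the bug", toy_agent, lambda c: c["version"] >= 3)
print(shipped["version"])  # → 3
```

The point of the sketch is the shape: there are no discrete stages to hand off between, just intent, context, and iteration until the check passes.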
Boris's point is this, quote: "The stages collapsed. They didn't get faster; they merged. The agent doesn't know what step it's on because there are no steps. There's just intent, context, and iteration." Boris is talking about the entire development process, but to relocalize it back to code review, which is the subject of this particular product, his section on code review is called "Give It Up." Boris writes, "The pull request flow needs to go. I was never a fan, but now it's just a relic of the past. I know that's uncomfortable. Code review is sacred: it's how you catch bugs, share knowledge, maintain standards."
"It's also an identity thing: reviewing code is what engineers do. But clinging to the PR workflow in an agent-driven world isn't rigor, it's an identity crisis. Think about it: an agent generates 500 PRs a day; your team can review maybe 10; the review queue backs up. This isn't a bottleneck worth optimizing, it's a fake bottleneck, one that only exists because we're forcing a human ritual onto a machine workflow."
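Boris's arithmetic is worth making concrete. Here's a toy backlog model using his hypothetical numbers; the code itself is my illustration, not from his essay.

```python
def review_backlog(generated_per_day, reviewed_per_day, days):
    """Track how an unreviewed-PR queue grows when generation outpaces review."""
    backlog = 0
    for _ in range(days):
        backlog += generated_per_day - reviewed_per_day
    return backlog

# Boris's hypothetical: agents open 500 PRs a day, the team reviews 10.
print(review_backlog(500, 10, days=5))  # → 2450
```

A 2,450-PR backlog after one work week is why he calls it a fake bottleneck: no amount of reviewer heroics closes a 490-per-day gap.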
All right, so the point that I'm trying to make is that clearly there is something in the air, and big questions, and perhaps an inevitable change coming to the way that we think about code review. And yet still, I was genuinely surprised to see how much antipathy there was towards this code review announcement.
There were a few reasons for that. One has to do with a sort of who's-going-to-watch-the-watchers idea. Professor Bo Wang writes, "Did Claude Code write Claude code review? Next question: can Claude code review review Claude Code's code and make it better? And even create a better Claude code review?" Now, he's being a little bit tongue-in-cheek, but the idea of whether the code review is likely to bring the same biases to the review that might have created the mistakes in the code in the first place (if people wrote their code with Claude Code) is, I think, a more practical question that a lot of folks have. The much bigger part of the response came around cost. The big thing that really caught people's attention was the pricing. In the pricing section of the Claude code review docs, it says, "Code review is billed based on token usage."
"Reviews average $15 to $25, scaling with PR size, codebase complexity, and how many issues require verification." And boy, were people shocked at this. One developer writes, "The Claude Code Max $200 a month plan is literally infinite tokens. You can just write the one prompt to do a PR review locally, save it as a skill, and you get unlimited reviews. $15 to $25 per review is nuts." Dagster Labs' Nick Schrock writes, "$15 to $25 USD per review? My Lord." Alex Kaplan says, "$20 for a PR?!" with a head-exploding emoji, adding, "Devin Review is free." So one part of this, I think, is just a sticker-shock argument. If you've got most developers used to paying in the tens of dollars for coding tools, with review-type features bundled into a broader plan, then this amount obviously seems much larger.
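As an aside, the workaround described above (write one review prompt, save it, and reuse it locally against any diff) is easy to picture. Here's a hypothetical sketch: a reusable review-prompt builder you could feed a git diff into with whatever model you already pay for. The template wording and the `build_review_prompt` helper are my own illustration, not Anthropic's product.

```python
REVIEW_TEMPLATE = """You are a strict code reviewer.
Review the following diff for bugs, security issues, and subtle logic errors.
Report each finding as: file, line, severity, explanation.

<diff>
{diff}
</diff>"""

def build_review_prompt(diff: str) -> str:
    """Wrap a git diff in a reusable, saved-once review prompt."""
    return REVIEW_TEMPLATE.format(diff=diff.strip())

# A tiny example diff with a deliberately planted bug (+ instead of *).
sample_diff = """\
--- a/billing.py
+++ b/billing.py
-    total = price * qty
+    total = price + qty
"""

prompt = build_review_prompt(sample_diff)
print("billing.py" in prompt)  # → True
```

In practice you would pipe the output of a real `git diff` through this and send the resulting prompt to your model of choice; the point is just that the prompt is written once and reused for free within an existing plan.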
The bigger part is that people are immediately doing the scale math. If a team opens lots of PRs, $25 per review sounds like it could explode very quickly into hundreds or thousands of dollars per developer per month. Now, it doesn't really matter that Anthropic is explicitly targeting a deeper review experience using multiple specialized agents, i.e. probably not something you'd use every single time you have to review anything; people are just extrapolating out from that number and coming up with some very big numbers on the other side. Another piece of this, though, is that I think it shows some chinks in the Anthropic and Opus armor right now. For a very long time, Anthropic was the only game in town when it came to coding. This has been well documented on this show to the point where we don't really need to discuss it. However, ever since the release of GPT-5, OpenAI has been explicitly attempting to close that gap and even get out ahead, and increasingly there is some evidence that that effort has been successful.
Westwinder writes, "I really don't understand why you would pay $25 for Claude to review a single PR when Opus 4.6 isn't even the best model for deep code review. GPT-5.4 is the only model I trust for reviews right now." Shopify product builder Gill writes, "Imagine spending $15 to $25 on code review and you still have daily downtime and buggy releases. I'd be more confident in this feature if their production quality was higher." T-BO writes, "In all our benchmarks, Claude reviews are always just worse. But don't you worry, now you can pay between $15 and $25 per freaking PR and you'll have good reviews. Are you kidding me?"
And even some of the first people testing it aren't necessarily coming away all that impressed. Daniel Sand tested Claude code review and said, "I'm always the first to get excited when Claude ships something new, but in this case enabling code review is just not worth it." Now, none of this is to say that there aren't people taking the other side of this argument. Lindy founder Flo Crivello writes, "People's comments on the $15 to $25 per PR price tag remind me of Michael Bloomberg's answer to people balking at the $2,700 per month cost of the Bloomberg Terminal: if you can't make $2,700 a month with our product, you've got bigger problems to deal with." Another engineer writes, "A $15 to $25 PR review bot that catches an incident that would have cost the company $5 million in breached SLAs and reputation is a no-brainer."
I think maybe, ultimately, the even more interesting dimension of the cost part of the conversation actually has to do with the implications for where things are going. I think increasingly, as AI, and especially AI coding, weaves itself deeper into how we do work, cost profiles which were somewhat ignorable before become not ignorable anymore.
Another way of saying it is that AI inference costs start to look a little bit closer to labor costs than to software costs. Sourcecraft CEO Dan Adler writes, "I spend much of my week, every week, talking to large enterprise buyers. The appetite for tokens is insatiable. C-level FOMO is off the charts, and every spare dollar is going into Claude Code, Cursor, etc. Tens or even hundreds of millions of dollars, in engineering organizations that cost billions in salaries, seems reasonable. But if CTOs can't deliver headcount savings, we're going to see some real whiplash on token budgets in the next two to four quarters."
Another way to put what Dan is saying is that something's got to give. The cost of agentic engineering can't keep rising without commensurate cost cuts somewhere else in the organization. An anonymous account on Twitter writes, "This marks the beginning of the end of the subsidized inference era. It will only go higher."
I think we're just beginning to grapple with what the full-bore cost of AI, when fully utilized, is going to look like, and what it means for the structure of organizations. There is, of course, however, another piece of this, one that Boris got at in that essay that I read before. As he put it, "It's also an identity thing: reviewing code is what engineers do." From some corners you can almost feel the existential nature of the response.
Look at how Montana puts it: "We need to admit defeat. We won't be reviewing code before it goes to production. Humans are already the bottleneck." Now, it wasn't strictly related to this release, but there's been a viral video going around Twitter/X from Mo (@MoIO), with the caption, "I was a 10x engineer, now I'm useless." It's actually less dramatic than the caption makes it sound, but it's a real, honest exploration of a lot of the feelings that many developers are having right now, as the fundamental nature of what they do as developers has changed underneath their feet almost overnight. And part of the response to code review does feel to me a bit like watching the last part of a sandcastle they've spent their whole lives building get washed away into the ocean. And why I think this part matters, regardless of whether you're an engineer, is that, as I've frequently said on this show, if you want to understand what other types of knowledge workers are going to be feeling in a year or two, watching how developers handle these changes, and how the broader shape of their field is shifting, is the closest thing we have to peering into the future. There is a deep set of existential questions in this liminal moment, and I think how folks resolve them on a personal, professional, and organizational level is going to create a template and a blueprint for how we deal with AI disruption in other areas. Now, there is one more piece of the negative response to code review that I think is worth tracking as well, one which is not just about cost but about pricing power. And this to me maybe represents another chink in Anthropic's armor, although I don't think it's limited to Anthropic alone.
Garb writes, "Feels like the wild west days of pricing. The general store has you hooked on their supply, they know it, and they're telling you how much they're going to fleece you, because they can." Ejaz writes, "Lol, Anthropic just killed a $50 billion industry with a single feature, again. Companies pay $50K a year to scan their code for vulnerabilities, and Anthropic's code review does it for you in minutes for a fraction of the cost." Saunders uses an analogy: "Anthropic is the new Amazon. Build on our platform, and once you get scale, we will build a basic version of your product and put you out of business instantly." There was a quote tweet of this which read, "At this point, it's pretty clear that if you are an app-layer company using the Claude Code SDK, it is inevitable that Anthropic sees your usage and then develops that tool in house." One of the potential reckonings in the AI space is going to be questions of power and consolidation around the very small number of neutron-star companies pulling everything around them into their orbit.
It is worth pointing out, of course, that this particular product is not guaranteed to work. Already, the teams at Cognition and OpenAI are using it as fodder for their own marketing, and maybe the market will force the price of AI code review down. Still, it's very clear, taking a step back, that the response around the code review product was about more than just price.
It cut to the quick of the types of issues that are just going to be part of our everyday reality in the period that's coming up next. We will of course continue to track this. I will say I have the sense, like swyx, that while it may take three to six months for everyone to get there, it is very likely that this debate, or conversation, will seem kind of quaint in retrospect.
For now that is going to do it for today's AI Daily Brief, I appreciate you listening
or watching, as always, and until next time, peace.



