Anthropic Was So Concerned About Its New Mythos-Based Model's Power That It Lobotomized Its Ability to Improve Itself – Futurism




Earlier this year, Anthropic refused to release its Mythos AI model to the public, saying it was simply too dangerous.
At the time, executives claimed the model was capable of punching through powerful cybersecurity safeguards, pointing to researchers who used it to discover thousands of vulnerabilities in widely used open source code.
Months later, Anthropic was finally ready to go public with the model. On Tuesday, the Dario Amodei-led company announced a Mythos-powered model called Fable 5, which it claims is “safe for general use.”
However, the new safeguards quickly frustrated AI researchers, who accused the company of intentionally lobotomizing Fable 5. The backlash was so fierce that Anthropic quickly adjusted the policy, as Wired reported on Wednesday, highlighting just how carefully the company is treading.
In its original announcement, Anthropic claimed the safeguards were designed to stop Fable 5 from improving itself, describing them as “new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development.” Just days ahead of the launch, Anthropic released a report on “when AI builds itself,” a trend that “might increase the risks of humans losing control over AI systems.”
However, AI researchers were not impressed by Anthropic hamstringing its latest model’s abilities.
“Anthropic’s latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won’t notice,” AI research firm SemiAnalysis tweeted.
“We are already seeing Anthropic’s latest model’s moderation filters our GPU inference research and programming,” it added.
Other researchers accused Anthropic of using Fable 5 to “shadowban,” or quietly restrict the accounts of, AI researchers. According to Anthropic’s system card, interventions limiting requests for “frontier LLM development” will “not be visible to the user.”
This last concern, which could’ve effectively sabotaged anybody trying to train competing models by quietly bumping them down to less powerful models without their knowledge, proved controversial enough for Anthropic to change its mind.
“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” the company told Wired in a statement. “We made the wrong trade-off and we apologize for not getting the balance right.”
“It felt like Anthropic was saying to the public, ‘We don’t trust anybody else to do AI research,’” AI startup Prime Intellect research lead Will Brown told the publication. “‘We are the only ones who have to do AI research.’”
It all comes in the context of Anthropic calling for a global freeze on AI advances while discussing the dangers of “recursive self-improvement.” In other words, the company is making a lot of noise about a sci-fi-sounding possibility: that AI will start to rapidly improve itself, potentially escaping the control of its human creators.
Beyond limiting its ability to develop AI tools, Fable 5’s new safeguards also trigger when it encounters requests “related to cybersecurity, biology and chemistry, or distillation.” Distillation is effectively using machine learning to train a “student” model on the behavior and reasoning of a “teacher” model, a practice that has sparked its fair share of controversy.
Anthropic has already publicly griped about large-scale attempts to distill, or “extract,” its underlying model, a hypocritical stance given its indiscriminate scraping of rights-protected content on the web to train its AI in the first place.
I’m a senior editor at Futurism, where I edit and write about NASA and the private space sector, as well as topics ranging from SETI and artificial intelligence to tech and medical policy.





















© 2026 Recurrent. All rights reserved.
