TrendKia

Developers Are Convinced Fable 5 Got Worse After Its Return, but Two Benchmarks Tell a Stranger Story

When Claude Fable 5 came back online, social media declared it broken and nerfed. The real problem isn't the model at all, but an overly aggressive safety filter sitting in front of it.

Michael Anderson, US & AI Correspondent · 4 min read

The moment Claude Fable 5 came back online on July 1, a wave of complaints hit social media. Users called it broken, nerfed, lobotomized, underperforming and simply not the same model anymore. One user wrote that he had spent the whole day carrying on the exact work he had been doing with Opus, and grumbled that politics had once again crushed everyday technological progress.

But the story is not that simple. On the same day, two different benchmarks, BridgeBench AI and Arena AI, published their data and landed on completely opposite conclusions. One found a severe drop in the quality of the outputs, while the other measured differences so tiny they might not even be worth noticing. The curious part is that, each in its own way, both of them are right.

Here is the short version: the model did not get dumber. The gatekeeper standing in front of it got a lot more aggressive. And that distinction matters enormously for you, depending on what you actually use Fable for.

What BridgeBench Actually Measured

BridgeMind, an AI evaluation platform, re-ran its full coding suite against the July 1 build of Fable 5 the very day it returned. BridgeBench tests real-world coding tasks across categories that include debugging, refactoring and resistance to fabricated information, scoring the model from 0 to 100 on how well it clears each category.

On paper, the numbers were brutal. Debugging collapsed from 86.2 to 25.9, refactoring slid from 73.6 to 38.4, and hallucination resistance fell from 75.9 to 61.7.

The real catch lies in the methodology. Of 12 TypeScript debugging tasks, only three ever actually reached Fable 5. The other nine were intercepted mid-way by Anthropic's new safety classifier and rerouted to Claude Opus 4.8. And BridgeBench scores every such fallback as a zero, because the model that answered was not the one being evaluated in the first place.
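To see how zero-scoring fallbacks can sink a category on their own, here is a minimal sketch. It assumes the category score is an unweighted mean of per-task scores, which the article does not confirm, so it is not expected to reproduce BridgeBench's published figures. The per-task scores, the `TaskResult` type and the model labels `fable-5` and `opus-4.8` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task_id: str
    score: float      # 0-100 quality score of the answer that came back
    answered_by: str  # label of the model that actually produced the answer


def category_score(results, target_model):
    """Unweighted mean where any answer not from target_model counts as 0."""
    if not results:
        raise ValueError("no results to score")
    total = sum(r.score if r.answered_by == target_model else 0.0 for r in results)
    return total / len(results)


def reached_only_score(results, target_model):
    """Mean over only the tasks the target model answered itself."""
    reached = [r.score for r in results if r.answered_by == target_model]
    return sum(reached) / len(reached) if reached else None


# Hypothetical run: 12 tasks, 3 answered by the target, 9 rerouted.
results = [TaskResult(f"ts-debug-{i}", 90.0, "fable-5") for i in range(3)]
results += [TaskResult(f"ts-debug-{i}", 80.0, "opus-4.8") for i in range(3, 12)]

print(category_score(results, "fable-5"))      # fallbacks counted as zero
print(reached_only_score(results, "fable-5"))  # only tasks the target answered
```

In this made-up run the target model averages 90 on the tasks it actually answers, yet the category lands at 22.5, because nine of the twelve entries are zeros regardless of how good the fallback answers were.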

This classifier was installed as one of the conditions of Fable's reinstatement. It was trained to block the jailbreak technique that Amazon had flagged, a method that could get Fable 5 to identify and demonstrate software vulnerabilities. The classifier does work. But it also catches plenty of things it should not. Debugging TypeScript looks enough like security work to the classifier that the fallback keeps firing over and over.

What Arena AI Actually Measured

Arena AI, an LLM benchmarking and comparison platform, examined the same question through an entirely different lens. The platform gathers thousands of blind human-preference votes across several categories (text, vision, document, code and agent) and ranks models using Elo scoring. Elo is a chess-derived rating system that estimates relative strength from head-to-head results, and with thousands of matchups the statistical uncertainty around each rating can be quantified. When two models face off anonymously and humans pick a winner, the score reflects the quality people actually perceive, not the routing happening behind the scenes.
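For readers unfamiliar with Elo, the sketch below shows the textbook update rule for a single head-to-head vote. The article does not describe how Arena actually fits its ratings, so treat this as an illustration of the general idea only; the K-factor of 32 and the function names are assumptions.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, outcome_a, k=32.0):
    """Return new (rating_a, rating_b) after one matchup.

    outcome_a is 1.0 if A won the vote, 0.5 for a tie, 0.0 if A lost.
    k (assumed here) controls how far one result moves the ratings.
    """
    delta = k * (outcome_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated models; A wins one blind vote.
print(update(1500.0, 1500.0, 1.0))
```

A win against an equally rated opponent moves both ratings by half of K, while a win over a much weaker opponent moves them very little, which is why a rating only stabilises after many votes.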

The before-and-after comparison showed Fable 5 largely holding its ground. Frontend code dipped from 1650 to 1623 Elo, but Arena noted that as the data keeps piling up, this gap sits within the confidence interval. Document performance improved by 34 points, expert text climbed by 25, and creative writing edged up slightly by 9. The categories that declined were coding at minus 18 and hard prompts at minus 3, which are precisely the spots where the classifier is most likely to intercept the prompt before Fable can answer.
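To put the frontend dip in perspective, the standard Elo formula converts a rating gap into an expected head-to-head win rate. The worked example below assumes Arena's scores follow the conventional 400-point scale, which the article does not state, and uses the reported 1650 and 1623 figures.

```latex
E = \frac{1}{1 + 10^{-(R_{\text{before}} - R_{\text{after}})/400}}
  = \frac{1}{1 + 10^{-27/400}}
  \approx 0.539
```

Under that assumption, the earlier build would be expected to win roughly 54 percent of blind matchups against the returned one, not far from a coin flip.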

In other words, when Fable 5 genuinely handles a task itself, it still performs like Fable 5. The frustration on X is not about a worse model, but about paying for a model that often is not the one doing the answering.

Who Will Notice and Who Won't

General users doing creative writing, document analysis, research and expert-level text queries will likely feel little to no difference. Those are exactly the categories where Arena AI shows flat or improved performance. Even if there is some improvement, it may be too small to detect, especially in subjective tasks like creative writing, where results are hard to fully measure.

So, broadly, writers, researchers and analysts will get the Fable 5 they expected. Developers are a different story. Anyone working in security-adjacent territory, such as memory-management code or anything that brushes up against words like vulnerability, exploit, hook or even fix, is going to keep hitting the fallback.

The gap between BridgeBench's collapse and Arena's stability comes down to the type of task. BridgeBench packs its suite with exactly the kind of code-repair and debugging prompts that set off the new classifier. Arena's human voters, on the other hand, ask a far wider mix of questions, and most of them do not look like exploit code to a safety layer.

What Comes Next

Anthropic has said the classifiers will improve over time, acknowledging that they currently cast far too wide a net. The original ban came after Amazon researchers discovered a technique to make Fable identify and demonstrate software vulnerabilities, and the U.S. government treated that as a national security threat. The fix was to make the classifier conservative enough to catch that threat and everything around it, then dial it back later. Anthropic has so far given no target date for when that will happen.

What this means for you

  • For writers and researchers: Anyone doing creative writing, document analysis or research gets the same Fable 5 as before, with barely any difference to notice.
  • For developers: When you run debugging or security-adjacent coding, your prompt often gets routed to Opus 4.8 instead of Fable 5, so the model you are paying for is not always the one answering.

Questions & Answers

Has Claude Fable 5 really been nerfed?
No, the model itself did not get weaker. When it handles a task directly it still performs as before; the real issue is the aggressive safety filter sitting in front of it.

When did Fable 5 come back online?
Claude Fable 5 came back online on July 1.

Why did the BridgeBench scores fall so sharply?
Because only three of 12 TypeScript debugging tasks actually reached Fable 5; the other nine were rerouted to Opus 4.8, and BridgeBench scores every such fallback as zero.

How big was the drop on BridgeBench?
Debugging fell from 86.2 to 25.9, refactoring from 73.6 to 38.4, and hallucination resistance from 75.9 to 61.7.

Why were Arena AI's results different?
Arena collects thousands of human votes across many categories, and most of those questions do not trigger the safety filter, so Fable 5 largely held its ground there.

Why was this classifier installed?
It was put in place to block the jailbreak technique flagged by Amazon that could get Fable 5 to identify and demonstrate software vulnerabilities.

Will this problem be fixed?
Anthropic has said the classifiers will improve over time, but it has given no target date for when that will happen.

About the author
Michael Anderson, US & AI Correspondent, San Francisco
Expertise: U.S. News, Politics, Government Policy, Elections, Economy, Breaking News, Congress, White House, Social Issues, International Relations

Michael Anderson is a US & AI Correspondent covering American politics, breaking news, economy, and national affairs. He delivers timely updates and clear analysis from across the United States.

Michael Anderson is a U.S. Correspondent specializing in coverage of American politics, government policy, economy, social issues, and major breaking news events. He reports on developments from Washington D.C. and across the United States, including elections, congressional activity, White House decisions, economic trends, and key national stories. With a focus on accuracy, speed, and contextual reporting, Michael provides in-depth analysis of issues shaping the United States and its global influence. His journalism helps readers understand complex political and economic developments through clear, factual, and balanced reporting.
