Claude Fable 5

754 points - today at 4:58 PM

  • eggbrain

    today at 5:08 PM

    For those of us on subscription plans:

    * From today through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost.

    * On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits. If capacity allows, we’ll extend the included window.

    * After this point—when sufficient capacity allows us to do so—we aim to restore Fable 5 as a standard part of subscription plans. We intend to do this as quickly as we can.

    The "offer, then remove" aspect is a bit eyebrow-raising -- it feels like they are trying to get subscribers to switch to usage-based billing, which makes me wonder if we'll ever get it after that June 22nd window.

      • jrflo

        today at 5:24 PM

        Still satisfied with my switch to codex/chatgpt. I couldn't imagine switching away from claude code when it first launched, but with the drastically more generous usage on codex for the same subscription tier, I just can't justify it.

          • sigbottle

            today at 6:01 PM

            Codex IME is just smarter. I think it shows, given both the anecdotes and how OpenAI has always been at the front of programming competitions and math problems.

            But Claude models seem to be better at long term problems or more ambiguous problems.

            I'm curious as to what the primary benefit is here. Are there secret improvements in training? There hasn't been much change in fundamental model architecture, I don't think. What about harnesses? I wonder what's pushing the AI. It seems like harnesses have been the main thing pushing AI ever since CoT.

              • Spartan-S63

                today at 6:19 PM

                I find that OpenAI's agentic tools and models are better for building human-maintainable software. Meanwhile, Anthropic seems to be cosplaying Apple while missing out on all the exceptional engineering required to create something that polished. Their admission of predominately using Claude with little human oversight and their stealth mode is an indictment of a poor engineering culture, from what I can surmise.

                • greenavocado

                  today at 6:27 PM

                  How smart a model is varies hour over hour, tracked over here: https://aistupidlevel.info/

              • rvshchwl

                today at 6:23 PM

                I've found Codex to be the better subscription for OpenClaw, because the limits are indeed very generous. However, I've found more and more that Claude Routines/Scheduled agents can replace all the tasks I use OpenClaw for, so I've been slowly switching over to Claude Code. Aside from OpenClaw, I don't find a lot of value in Codex as a harness on its own.

                • wsatb

                  today at 5:32 PM

                  I guess enjoy it while it lasts? OpenAI won't be able to subsidize that forever either.

                    • windexh8er

                      today at 6:20 PM

                      Agreed. I think the Chinese labs are proving that OpenAI and Anthropic don't have a moat in almost every aspect, especially pricing. I also think people are getting annoyed with the constant lift and shift. I've seen more folks drop Claude Code and Codex, specifically, because of the lock-in they give the providers. I'm curious to see how people standardize on adjacent tooling, and whether Anthropic, Google, or OAI move to block usage, akin to the games Anthropic has been playing as of late.

                      I think the end game is routed model usage and SLMs. I think Apple is going to prove this in the consumer space pretty handily and I'm curious how the Android ecosystem responds since the hardware is considerably lacking in model performance. I think Apple has a huge opportunity here, as much as I don't like their current ecosystem of walled garden. They did position themselves very well with ARM and custom chips for their hardware. Hopefully the broader ecosystem of ARM and Linux are able to make some headway and we see a more formalized, and broadly accepted, architecture to capitalize on.

                      • flatline

                        today at 5:45 PM

                        I don't think anyone has a firm grasp on actual inference costs -- including the research and training that has gone into those models. We've got near-frontier capabilities from open source models from China at pennies on the dollar compared to US big tech rollouts. OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?

                          • andrewmutz

                            today at 6:03 PM

                            Both can be true. They can be charging what the market will bear, and still be charging less than their costs of running it.

                              • wyre

                                today at 6:30 PM

                                [dead]

                            • dontlikeyoueith

                              today at 6:25 PM

                              > OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?

                              Both. They are charging the most they can get away with and that amount is still heavily subsidized by VC capital.

                              • MichaelMedbed

                                today at 5:51 PM

                                [flagged]

                                  • kllrnohj

                                    today at 6:04 PM

                                    regardless of whether that's true or not, US companies doing hosted inference of the models coming out of China are also significantly cheaper than those from OpenAI or Anthropic

                                    • polski-g

                                      today at 5:57 PM

                                      Not relevant to the post.

                              • ChrisMarshallNY

                                today at 5:51 PM

                                I'm planning on switching from the $20/month to the $100/month plan.

                                It's worth it, and I can afford it, but I am not really the right type of user for token-based usage. It's all for personal and free work.

                                  • rnxrx

                                    today at 6:26 PM

                                    I have the $100 plan and had almost never run out of credits until I started using the ultracode / workstreams feature with Opus 4.8, at which point I managed to blow the full 6-hour allocation in about 20 minutes. In fairness, it did some amazing things with the extracted information, but it also strongly suggested that I'd need the $200 subscription *plus* a budget for extra usage.

                                    • micah94

                                      today at 6:17 PM

                                      Just a personal anecdote but I have not hit any more thresholds or limits since switching to the MAX plan and so far, it's been worth it. But I do wonder how long even this will last...

                                        • ygjb

                                          today at 6:29 PM

                                          I think subscription models are sustainable, but longer term we should probably expect to see more prompt optimization happening in the providers' inference pipelines. For example, unless you explicitly tell the agent or API to use a specific model, fronting the inference layer with a caching prompt classifier that determines which model to use and automatically selects the lowest-cost one would probably already save a lot of money (IDK if Claude/OpenAI do this on the backend, but several services I have worked on do things like this to reduce the cost of delivering customer-facing inference at scale).
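The routing idea in the comment above can be sketched roughly as follows. This is a toy illustration, not how any provider is known to work: the tier names, relative costs, keyword hints, and the `classify` and `route` helpers are all hypothetical, and a real system would presumably use a learned classifier rather than keyword matching.

```python
from functools import lru_cache
from typing import Optional

# Hypothetical model tiers with made-up relative costs (not real prices).
MODELS = {"small": 1, "medium": 5, "large": 25}

# Hypothetical keywords that hint at a harder task.
HARD_HINTS = ("prove", "refactor", "architecture", "debug")


@lru_cache(maxsize=4096)
def classify(prompt: str) -> str:
    """Toy classifier: return the cheapest tier judged adequate for the prompt.

    Results are cached, so identical prompts are only classified once.
    """
    text = prompt.lower()
    if any(hint in text for hint in HARD_HINTS):
        return "large"
    if len(text.split()) > 50:
        return "medium"
    return "small"


def route(prompt: str, requested_model: Optional[str] = None) -> str:
    """Honor an explicit model choice; otherwise pick the cheapest adequate tier."""
    if requested_model is not None:
        if requested_model not in MODELS:
            raise ValueError(f"unknown model: {requested_model}")
        return requested_model
    return classify(prompt)
```

For instance, `route("What is 2 + 2?")` falls through to the cheapest tier, while `route("hi", requested_model="large")` respects the explicit choice. Caching on the exact prompt string only helps with verbatim repeats; whether that pays off depends on how repetitive the traffic is.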

                                  • andai

                                    today at 6:05 PM

                                    A few weeks ago they massively cut usage on free tier.

                                • knuckleheads

                                  today at 5:51 PM

                                  I feel like Codex made a big push to run everything on your laptop. With Claude, I get 4 CPUs, a fair amount of RAM, and 30 GB for every one of my dumb ideas for free in the cloud containers. Codex used to be similar, but last time I tried, it just kept pushing me to run it locally on my laptop, which I really did not want to do with 20 requests going at once. That's the main advantage for me at the moment.

                                    • simjnd

                                      today at 6:00 PM

                                      What runs in cloud containers? The dev servers, builds, etc.? I tried to quickly glance at the Claude website and it doesn't mention cloud containers on their pricing page.

                                  • dd8601fn

                                    today at 5:50 PM

                                    I have trouble justifying gpt after that gross stuff with the war department.

                                    Though the day is coming when there’s no distinguishing, I’m sure.

                                    • shimman

                                      today at 6:00 PM

                                      I've only ever had the $20/month claude plan, but last night I took the time to set up opencode + openrouter, paying for deepseek + glm. In my previous experience, while extremely awkward, I'd hit my limit within one or two chat replies, and it'd take me like 4 limit cycles to complete my task. Now I'm able to complete an equivalent task for less than $2 in two cycles (ask -> revise).

                                      I'm doing basic web development here utilizing animejs. Nothing too complicated (mostly saving time doing the scaffolding, still write the bulk of animations manually).

                                      Truly believe that American companies are going to get completely curb stomped by China due to greed, ineptitude, and violating the social contract.

                                        • simjnd

                                          today at 6:03 PM

                                          I've switched from OpenRouter to using Deepseek directly from their platform since OpenRouter providers were pretty flaky and inconsistent.

                                          Deepseek V4 Flash is surprisingly capable and insanely cheap. It takes so much usage just to get the session cost to $0.01.

                                          • nozzlegear

                                            today at 6:09 PM

                                            > and violating the social contract.

                                            I agree with you on pricing, but what do you mean by this?

                                              • shimman

                                                today at 6:31 PM

                                                Sure, modern American corporations care more about hoarding wealth than helping build up US society. Since neoliberalism became the mainstay economic position of the US, income inequality has skyrocketed, healthcare costs have increased, childcare has become more expensive than university, and housing has become both unaffordable and unobtainable. The cost of simply existing has increased while life has become unstable.

                                                Why aren't corporations doing more to help workers with childcare? Why aren't they doing more profit sharing with workers? Why aren't they encouraging unions or sectorial bargaining? Why isn't the government mandating any of this?

                                                Americans very rarely benefit when US corporations do well. That needs to change. No one benefits if Meta continues making billions in profit every quarter while society suffers from isolation, depression, suicide, and scams from their services. Americans don't benefit if health insurance companies are making massive profits while they can't afford deductibles.

                                                Our society has been set up to simply extract wealth in all facets of life. That's a sick society and it needs to change.

                                                I'm not saying China does this better; in fact, China has some of the worst worker rights of all the industrialized countries. But at least American consumers would benefit from cheaper, higher-quality Chinese goods. The world would likely benefit too if America got off the cold war hype train that did nothing to benefit humanity outside of those making weapon systems.

                                    • joshstrange

                                      today at 6:20 PM

                                      I would not use this if you are on a subscription. In <8min it burned my entire 5hr window (which had just reset, it appears; I have over 4 hours till it resets, and I hadn't used CC at all today aside from this), and then it used up ~$15 more in usage before I could stop it.

                                      I am on the $100 Max plan.

                                      • 0erofootprint

                                        today at 5:44 PM

                                        For me it blocked almost immediately. I had it writing code related to message digests, and it seemed to think it was too gifted for that. It gave the security warning and switched back to 4.8. Whatever... it will probably have the API error soon. I have mostly switched to the Codex 200 a month plan. I've found their 5.5 xhigh to be better than Opus 4.8 "ultracode." Also, I have not once seen their servers fail for compute unavailability, unlike Anthropic's, where it happens almost every hour.

                                          • kkoncevicius

                                            today at 5:51 PM

                                            I had a similar experience. I wanted to test it by asking it to summarise a scientific OMICs-related paper. It gave a warning about me potentially developing a bio-weapon or something like that. And switched back to Opus 4.8.

                                        • irthomasthomas

                                          today at 6:27 PM

                                          This is just the sales team doing their thing, applying the Law of Scarcity to drive demand.

                                          It's the same exact speed as opus >=4.5, sonnet 4.5, and twice the speed of opus <=4.1

                                          It must have about the same active parameters, or else it's a larger model running in turbo mode (smaller batches) and being heavily subsidized for some reason. But given most of the benchmarks are within 5%, I doubt it is a much larger model. Most perplexing.

                                          • smith7018

                                            today at 5:54 PM

                                            Fwiw it's not available on my enterprise account: "Disable zero data retention to unlock Fable 5 access"

                                              • stronglikedan

                                                today at 6:21 PM

                                                We just blocked it at our org for this reason. They will "retain agent request and output data associated with this model, regardless of your Cursor Privacy Mode setting."

                                                • sdellis

                                                  today at 6:21 PM

                                                  What does "zero data retention" mean? What kind of data does it need to unlock?

                                                    • drakythe

                                                      today at 6:27 PM

                                                      The announcement details it. They're storing 30 days of data on all surfaces, first and third party. They claim it is for security purposes so they can review and check for long term jailbreak and distillation efforts.

                                                      They also, FWIW, say that they've instituted new policies on their end such as logging any human access to the stored data and automated deletion after 30 days in "most" cases (with another link to a document detailing that further).

                                              • a-dub

                                                today at 6:30 PM

                                                the claimed inference cost is 2x. if that is true, it is massive and remarkable that they're able to do anything like this at all.

                                                • alvis

                                                  today at 5:15 PM

                                                  It’s too obvious that Anthropic needs to find a way to earn enough revenue before the IPO. The Claude subscription isn’t earning much money, I bet.

                                                    • sdellis

                                                      today at 6:26 PM

                                                      That's a big problem for all of the AI companies. Most people don't find the technology compelling, accurate, or ethical enough to pay for a subscription.

                                                      Why wouldn't Anthropic just wait until people start subscribing, do some kind of marketing push, or obtain some kind of other sustainable revenue stream, before they go IPO? I wonder if they see the writing on the wall with all of this and want to cash out as quickly as possible?

                                                      • sigmoid10

                                                        today at 5:21 PM

                                                        I think they are just prioritizing enterprise customers, because this is where they have historically made the most money.

                                                          • dylandevelops

                                                            today at 6:24 PM

                                                            I agree with you here. Unfortunately, this tends to be the case, with smaller developers paying the price.

                                                        • AtlasBarfed

                                                          today at 6:24 PM

                                                          That's not how it works. They don't need revenue, they need addicts.

                                                          Specifically they need businesses that fired people and adapted their business to the products, so when the unsubsidized costs hit the businesses are forced to eat the true costs.

                                                          Yes, they can't afford to give the products away for free, but what is essentially happening with AI services is economic dumping: keep costs artificially low to get people to fire everybody, and then jack up the rates once they have monopoly control.

                                                        • lisperforlife

                                                          today at 6:23 PM

                                                          My guess is that it is a massive model similar to GPT 4.5, and the $10/$50 pricing for its output will discourage people from using it. I also read safety = nerfed.

                                                          • nickandbro

                                                            today at 5:12 PM

                                                            Get them addicted then cut them off. Oldest trick in the book.

                                                              • toomuchtodo

                                                                today at 5:15 PM

                                                                More of a free trial to those authenticated and qualified with existing payment. Subscription billing is going away for sure though eventually based on the economics. Token “all you can eat” is a capital furnace otherwise.

                                                                (I’m highly confident open models will eventually achieve a similar performance benchmark with distillation over time)

                                                                  • CuriouslyC

                                                                    today at 5:24 PM

                                                                    Subs lose money on individuals to get those individuals to force their companies to pay for the corporate plan. The economics are bad, but so are the economics of grocery stores selling Milk and Bananas at a loss to drive traffic, which they basically ALL do.

                                                                      • HDThoreaun

                                                                        today at 5:43 PM

                                                                        I haven't seen any evidence showing that subscriptions cost the labs money.

                                                                        • toomuchtodo

                                                                          today at 5:24 PM

                                                                          Companies don’t want to pay when the value realized does not exceed the cost.

                                                                          AI Savings Misses 'Should Be Making Executives Uncomfortable,' Bain Says - https://news.ycombinator.com/item?id=48359010 - June 2026 (0 comments)

                                                                          AI sticker shock hits corporate America - https://news.ycombinator.com/item?id=48307098 - May 2026 (146 comments)

                                                                            • CuriouslyC

                                                                              today at 5:31 PM

                                                                              What's the realized value of not losing your engineers because you're letting them use their preferred tools?

                                                                                  • toomuchtodo

                                                                                    today at 5:32 PM

                                                                                    Retain and hire the engineers who don’t require heavy use of AI to deliver value? The current SWE job market speaks for itself. Where will you go where they will let you burn up tokens in a high cost of capital macro?

                                                                                    ZIRP (zero interest rate policy) is over, software engineers no longer call the shots now that there isn’t vast amounts of capital chasing yield, and that capital bidding up salaries and keeping the labor market for engineers tight.

                                                                                    If you are x more productive with generative AI, very shortly you are going to have to prove it with a token budget (or, if you’re lucky, an org willing to spend for on prem hardware for capped token cost, fixed capex vs uncapped opex).

                                                                                    The comparison is not SWE vs SWE with AI. It is SWE vs SWE with AI with a constrained token budget ($x/month) delivering the same value at the same or lower cost. If you cannot prove that you are wildly (vs marginally) more productive with the AI, why would they pay for it? Prove it.

                                                                • xpct

                                                                  today at 5:10 PM

                                                                  I agree, this looks like their plan to phase out subscriptions. This will probably come with Opus nerfs later.

                                                                    • rapind

                                                                      today at 5:18 PM

                                                                      I just assume Opus is constantly nerfed based on capacity. I was exclusively Claude for a long time, but the inconsistency in quality, constant outages, and slow downs were too hard to work with.

                                                                      I just use dumb and fast models now. I'm more engaged. I think that the higher the quality of the model, the more you tend to vibe with it, and the more hallucinations you then miss. I'm not sure which is more productive, but I definitely burn out faster the more I vibe. At some point you're spending your time on forums, discord, or youtube instead of engaging with what you're building. Or you yak shave about your tooling and end up creating the 600th multi-agent gastown harness and blowing thousands of dollars on tokens to create it, only to discover it's too expensive to actually use.

                                                                        • dylandevelops

                                                                          today at 6:26 PM

                                                                          I agree with you. The more I vibe code, the less interested I feel in what I'm building. Working with models that force me to think, especially with personal projects, helps me stay engaged and enjoy what I am doing more.

                                                                          • winter_blue

                                                                            today at 5:27 PM

                                                                            Composer 2.5 Fast that Cursor is giving away for very little has been amazing.

                                                                            • aplomb1026

                                                                              today at 5:36 PM

                                                                              [flagged]

                                                                          • nonethewiser

                                                                            today at 5:15 PM

                                                                            It's possible that they will transition to usage credits but why not take them at their word? To date they have continued to offer better and better models to their subscription plans.

                                                                              • timcobb

                                                                                today at 5:20 PM

                                                                                What's their word? Have they commented?

                                                                                Upd: I meant big picture, not with respect to this model release. Where do subscriptions figure into their strategic vision. Will consumers end up paying enterprise prices in the future?

                                                                                  • KyleJune

                                                                                    today at 5:34 PM

                                                                                    In the blog post they say that when sufficient capacity allows them to do so, they aim to restore Fable 5 as a standard part of subscription plans and intend to do so as quickly as they can.

                                                                                    • dbbk

                                                                                      today at 5:37 PM

                                                                                      Read it again

                                                                                        • timcobb

                                                                                          today at 5:42 PM

                                                                                          I did, I'm not seeing anything about the future of subscriptions at Anthropic.

                                                                                      • ls612

                                                                                        today at 5:34 PM

                                                                                        In TFA they say they intend to restore Fable 5 to subscription plans some time after June 22. That is what "take them at their word" means.

                                                                                • taormina

                                                                                  today at 5:13 PM

                                                                                  Those already landed! Oh, you weren't talking about 4.8?

                                                                                    • piva00

                                                                                      today at 5:15 PM

                                                                                      Even Opus 4.7 felt like a regression from 4.6, consumed a lot more tokens while I didn't experience any substantial improvements. The company I work at simply rolled back to 4.6 on everyone's configurations, disabling the toggle for 4.7.

                                                                                        • taormina

                                                                                          today at 5:38 PM

                                                                                          4.6 has been my happy place for getting anything done for a while now.

                                                                                  • xvector

                                                                                    today at 5:16 PM

                                                                                    HN needs to take a chill pill. Could it be that Mythos is expensive and they just want to give people a taste of it? I mean the alternative is not offering it at all?

                                                                                      • 8note

                                                                                        today at 5:56 PM

                                                                                        it's unclear how they can offer it broadly but only for half a month.

                                                                                        why do they have capacity now that they won't in a few weeks?

                                                                                          • losvedir

                                                                                            today at 6:21 PM

                                                                                            Break between training runs?

                                                                                            • bigtechennui

                                                                                              today at 6:19 PM

                                                                                              It’s offered broadly after, for more money. It’s subsidized as marketing

                                                                                  • timcobb

                                                                                    today at 5:18 PM

                                                                                    Ooof so are we thinking that in the next 6-12 months subscriptions will be replaced with paying retail like enterprise currently?

                                                                                      • thewebguyd

                                                                                        today at 6:00 PM

                                                                                        I certainly hope not. PAYG is not predictable enough for smaller companies or individuals. Where I work (non-tech company), PAYG would never fly. We aren't big enough for that. Of course, you can set usage budgets, but there's a pretty big difference between $200/user/month vs. the equivalent PAYG usage being closer to $1,000/user/month, if you currently use the subscription plan to its limits each week.

                                                                                        Going PAYG only will effectively take these tools away from a huge amount of people and accelerate the push for local LLMs.

                                                                                        OTOH, accelerating the push for local LLMs would also be fine with me.

                                                                                        • aseipp

                                                                                          today at 5:45 PM

                                                                                          They almost certainly already make a fuckload more money off API pricing than they do subscriptions, even if there might be more total subscription users. So offering subscriptions even at some loss is probably going to continue. Honestly, I'd be surprised if they even lost money on most subs; there are definitely Token Whales out there who mess up all the accounting, though.

                                                                                          Realistically I think Anthropic just has insane demand but finite capacity to run models, and Fable will just make them more money if they dedicate it to API pricing. I suspect the goal here is something like: get individual engineers/PMs on their personal plans to taste Fable and then go to their meetings and say "Yes doubling the price of every single input/output token is a good idea, boss".

                                                                                            • timcobb

                                                                                              today at 6:16 PM

                                                                                              But I don't want to be the developer who goes and says we must pay all this money for these tokens. I don't know who wants to be that developer.

                                                                                          • CuriouslyC

                                                                                            today at 5:23 PM

                                                                                            I don't think they'll phase out subscriptions ever, their whole play has been to drive demand from the bottom up. Get engineers hooked on building with claude at home, then get them to demand the ability to use it at work, and bend over their employer with no lube.

                                                                                            They'll probably tighten the quotas to rein in whales though.

                                                                                            • ygjb

                                                                                              today at 5:55 PM

                                                                                              I doubt it, given the importance of those subscriptions for building and maintaining market awareness.

                                                                                              The AI landscape is changing rapidly, with Apple announcing the option to change the AI backend and potential requirements to enable AI choices as well, similar to EU browser choice requirements (this is more reading tea leaves than any actual requirements I am aware of). The new OS changes coming to support Googlebook and the deep Copilot/AI integration into Windows will make maintaining user-facing subscriptions essential for independent model developers like OpenAI, Anthropic, and Mistral to remain relevant longer term.

                                                                                              If they don't maintain that relevance, there is an increasing likelihood that they will get consumed by other companies, whether it's Apple, Microsoft, or Google to form a foundation for their OS, or other cloud providers.

                                                                                                • timcobb

                                                                                                  today at 6:05 PM

                                                                                                  That makes sense, but what about the specific bifurcation we're seeing here of super primo models versus still good models being available to subscriptions?

                                                                                                  It's kind of annoying not getting access to the primo model and paying 200 bucks a month. I understand 200 bucks a month is basically nothing though.

                                                                                                  Like I don't totally understand why they'd let me have it for a couple weeks and then take it away and say I can have it but I have to pay retail and retail is like $1,000 a day.

                                                                                                  It's better to have loved and lost than to have never loved at all??

                                                                                                    • ygjb

                                                                                                      today at 6:23 PM

                                                                                                      It's a trade-off. Every hyperscaler is buying and building compute capacity as fast as they can dodge red tape. There is limited compute capacity, and scarcity is a real thing.

                                                                                                      As a consumer I can choose to buy subscriptions to a range of things, including $5 droplets or VMs on a broad range of cloud hosting providers. I can even buy cheap bare metal at a bunch of providers at an affordable retail rate.

                                                                                                      I can also buy "unlimited" AI packages that will be optimized to fit the cost model from a variety of services, with different impacts, such as rolling outages when I consume a daily or hourly allotment.

                                                                                                      Right now VC and the investor class are subsidizing the rapid evolution of the services and availability, but that VC is running out. In more traditional economies, AI would have developed and rolled out more slowly, and through metered subscriptions, with the eventual rolling out of "unlimited" packages like telephone, internet, or cell services once the market became commoditized.

                                                                                                      We have seen a big inversion of that with the race to "win" AI marketshare. Now the true cost is being exposed, and the most competitive and capable models are hideously expensive to operate, so it makes sense that we are moving to metered billing for a utility service. If you want gas, you can buy regular or premium. If you have a premium car you definitely want the premium, but for most people regular is good.

                                                                                                      Give it a couple of years, and the survivors will settle around fairly industry standard models of consumer grade services, pro-sumer accounts, and business/enterprise models.

                                                                                                      Things are still shaking out, but I get the sadness. Luckily I work at a big tech company who is banging the drum on doing experimentation so I use my prosumer claude pro and other accounts at home for hobby stuff, and save my heavy lifting and potentially experimentation for work :P

                                                                                          • matheusmoreira

                                                                                            today at 6:04 PM

                                                                                            This is really sad... I really didn't want to be priced out of these models but it looks like that's going to happen sooner rather than later.

                                                                                            • nicce

                                                                                              today at 5:51 PM

                                                                                              > The "offer, then remove" aspect is a bit eyebrow-raising -- it feels like they are trying to get subscribers to switch to usage-based billing, which makes me wonder if we'll ever get it after that June 22nd window.

                                                                                              Probably all about the IPO.

                                                                                              • daft_pink

                                                                                                today at 6:09 PM

                                                                                                I’m just about ready to cancel my small business 5 user plan with max licenses, because although cowork is really great, I just find OpenAI/Codex to be a lot better most of the time.

                                                                                                • kyledrake

                                                                                                  today at 5:21 PM

                                                                                                  Considering their apparent nerfing of the end user plans in favor of enterprise clients, is Anthropic still the "more ethical AI company" like everybody loves to tell me all the time?

                                                                                                  Assuming this isn't just a supply issue on their side, nothing says "ethical AI" like only allowing mega corporations to use it through cost barriers.

                                                                                                    • estearum

                                                                                                      today at 5:24 PM

                                                                                                      You really misunderstand what AI-doom people are worried about if you think this is anywhere near the top (or middle, or bottom) of the list of concerns.

                                                                                                        • Jackson__

                                                                                                          today at 5:46 PM

                                                                                                          If you can't trust them to act ethically on the small scale, why would you expect that to turn around once it gets to a larger much more important scale?

                                                                                                          How many government sanctioned school bombings does it take for them to quit working with said government? For now we know that number is somewhere between infinity and 1.

                                                                                                            • estearum

                                                                                                              today at 5:52 PM

                                                                                                              It literally does not register as "unethical" at any scale to have different products or prices for different customer tiers.

                                                                                                              The question of collaboration with USG is a much more complex one, but is not the one raised above.

                                                                                                          • throwaway894345

                                                                                                            today at 5:32 PM

                                                                                                            Yeah, it's positively precious to think the specific pricing strategy for consumers is the overriding ethical concern with OpenAI, etc. I don't have any particularly strong affinity to any AI company, but comparing pricing to say mass surveillance is ... something else.

                                                                                                              • kyledrake

                                                                                                                today at 5:33 PM

                                                                                                                Your beautiful straw man is negated by the fact that Anthropic seems quite eager to get back on the DoD gravy train https://www.reuters.com/business/aerospace-defense/blacklist...

                                                                                                                  • ygjb

                                                                                                                    today at 5:47 PM

                                                                                                                    Setting aside the simple fact that there is no ethical consumption under capitalism, the reality is that regardless of how Anthropic feels, it is becoming clear that many, if not all countries regard AI developments as strategic technologies (and they should).

                                                                                                                    Anthropic needs to be at least somewhat in the good graces of a capricious administration that is already under pressure from businesses and citizens to regulate AI companies across multiple different domains, whether it's energy consumption, job displacement, military and defense applications, surveillance, etc.

                                                                                                                    If Anthropic wants to survive, they need to acquire influence with the government that most impacts them as an American company, and a massive exporter of services in the AI space to other countries, otherwise they could get locked down and locked out of the market for national security reasons.

                                                                                                                    It sucks, but sometimes the survival choice is to make an ethical compromise in hopes that you can still be around to make better decisions later.

                                                                                                                      • ericmay

                                                                                                                        today at 5:55 PM

                                                                                                                        > Setting aside the simple fact that there is no ethical consumption under capitalism

                                                                                                                        This "simple" fact needs quite a bit of additional context and work. Making grandiose ethical claims like this can be countered with other grandiose claims such as the fact that there is no ethical existence under communism or socialism.

                                                                                                                          • ygjb

                                                                                                                            today at 6:15 PM

                                                                                                                            Sure. Why not, I'm bored today and waiting for some stuff to finish up :D

                                                                                                                            The fact that there is no ethical consumption under capitalism is not material to whether or not ethical existence is possible under communism or socialism. In order to survive in a capitalist society, one inherently has to make choices that require trade-offs, and those trade-offs are burdened by a history of decisions made not just by the people alive today, but our ancestors as well. Does that mean I walk around chanting "Reparations", "Land-back", or other calls to action? No, but I do acknowledge that there are unresolved issues and as a Canadian, I know we need to do more to resolve treaty issues, and environmental issues, and systemic discrimination. I also know that Americans need to do better to address systemic discrimination and many, many other issues. It also doesn't mean I want to give back my house, or give away all of my possessions. It just means I try to make good choices and support businesses and people that are open about the trade-offs they make and try to engage as ethically as possible.

                                                                                                                            Acknowledging those facts doesn't absolve us of responsibility, it's a framework that allows folks concerned about whether or not they are doing the right thing to accept the trade-offs that they choose to make and be responsible and accountable for those choices to themselves or their communities.

                                                                                                                            We live in a world with scarce resources. It's possible that with a foundational redesign of the global economy, and the requisite authoritarian government that would be required to force such a redesign, we could eliminate food scarcity, solve energy scarcity, and make sure that everyone has a place to live. Those trade-offs are probably not worth the ethical cost in political and physical violence required to accomplish it. We have seen the trade-offs that happen when the powerful are able to exploit communist or socialist governments. We are seeing the "late stage capitalism" impacts of allowing the powerful to exploit capitalism in democratic societies. Acknowledging that the current capitalist system has led to the greatest prosperity for the upper echelon (financially) of humanity, and a dramatic reduction in global poverty shouldn't obscure the reality that much of that wealth comes from exploitation of people and the environment.

                                                                                                                            It's a huge problem to unwind, and we can't let the burden of every choice that we make stop us from trying to do better, but we (as in society in general) can't do better if we don't at least acknowledge the compromises we are making along the way, and try to plan to fix it in the future.

                                                                                                                            Probably a topic better suited to beer and a pub setting than HN though :P

                                                                                                                    • estearum

                                                                                                                      today at 5:39 PM

                                                                                                                      Where is your evidence that this is Anthropic backtracking on its ethical and contractual commitments rather than DOD backtracking on its blatantly illegal coercion (which it's almost certainly going to be successfully sued for)?

                                                                                                                      Talk about a strawman!

                                                                                                                        • kyledrake

                                                                                                                          today at 5:46 PM

                                                                                                                          As someone who was in Minneapolis during the ICE raids, including one where a US citizen at a nearby restaurant was thrown in prison for 3 days despite having his passport on hand because he looked Asian, it's hard for me not to see the ethics of AI companies actively collaborating with the Trump administration as different flavors of ice cream.

                                                                                                                            • estearum

                                                                                                                              today at 5:49 PM

                                                                                                                              Are the two analytical frameworks available to you just "black and white thinking" or "it's different flavors of ice cream?"

                                                                                                                                • kyledrake

                                                                                                                                  today at 6:02 PM

                                                                                                                                  Are the personal attacks really necessary to make your argument?

                                                                                                                                    • estearum

                                                                                                                                      today at 6:08 PM

                                                                                                                                      Fair point! Edited to remove.

                                                                                                          • DonsDiscountGas

                                                                                                            today at 5:47 PM

                                                                                                            I don't think offering a product under a certain set of terms obligates a company to maintain that offering forever. The bait and switch is certainly annoying but seeing as they're very upfront about it you can't say you weren't warned. Don't like it? Don't use it.

                                                                                                            • brianmcnulty

                                                                                                              today at 5:22 PM

                                                                                                              Why would you have ethics when you could get that IPO money instead?

                                                                                                              • wongarsu

                                                                                                                today at 5:32 PM

                                                                                                                I wouldn't call Anthropic ethical. But between Anthropic and OpenAI, Anthropic is the more ethical one

                                                                                                                  • Maken

                                                                                                                    today at 5:22 PM

                                                                                                                    The bar is just too low.

                                                                                                                    • fridder

                                                                                                                      today at 5:24 PM

                                                                                                                      More ethical in some areas, actively user hostile in others

                                                                                                                      • xvector

                                                                                                                        today at 5:24 PM

                                                                                                                        Yup - who cares about x-risk or red lines for domestic mass surveillance anyways? I draw my red lines at prioritizing profitable customers when heavily resource constrained. That's the true definition of evilness!

                                                                                                                    • Aleleo76

                                                                                                                      today at 5:49 PM

                                                                                                                      Pay-as-you-go billing is a kind of drug. I use it every now and then when I'm working on a project with Opus, and in a moment you spend a fortune.

                                                                                                                      • DonsDiscountGas

                                                                                                                        today at 5:44 PM

                                                                                                                        I expect that depends on demand, feedback, and whether GPT-6.0 gets released and is competitive

                                                                                                                        • ABS

                                                                                                                          today at 5:18 PM

                                                                                                                          also: Fable takes 2× the usage of Opus

                                                                                                                          • clementg

                                                                                                                            today at 5:21 PM

                                                                                                                            I really don't want this to start being the norm

                                                                                                                              • baggachipz

                                                                                                                                today at 5:23 PM

                                                                                                                                I don't see how it won't be. They lose insane amounts of money on subscription plans. I'm sure they still lose money on usage-based billing, but probably not as much.

                                                                                                                                  • JumpCrisscross

                                                                                                                                    today at 5:28 PM

                                                                                                                                    > They lose insane amounts of money on subscription plans

                                                                                                                                    Do we know this? I’ve seen evidence they lose money on heavy users. But so do gyms.

                                                                                                                                      • saaaaaam

                                                                                                                                        today at 5:34 PM

                                                                                                                                        How do gyms lose money on heavy users? A heavy gym user isn’t really costing the gym anything extra as far as I can see.

                                                                                                                                          • JumpCrisscross

                                                                                                                                            today at 5:36 PM

                                                                                                                                            > How do gyms lose money on heavy users?

                                                                                                                                            Most gyms sell more subscriptions than they can fit under their roof at one time. If a gym only sells to heavy users, it will either be constantly turning members away or have to buy more equipment. Its equipment will wear out faster. Depending on amenities, it will go through towels, soap, water, et cetera faster, too.

                                                                                                                                              • tripleee

                                                                                                                                                today at 5:57 PM

                                                                                                                                                Gym equipment lasts 10+ years in a commercial gym, at $50/mo that's a minimum of $6k paid from a single person.

                                                                                                                                                Unless they're really, seriously wasteful with the soap.. there's no chance a gym is losing money on a heavy user

                                                                                                                                                  • rafram

                                                                                                                                                    today at 6:19 PM

                                                                                                                                                    It depends on the gym and their business model! A super-budget gym like Planet Fitness that charges $15/month is going to lose money on heavy users, but they count on most of their members being infrequent gym-goers. A luxury gym like Equinox that charges $300/month can target heavy users without any issues, and they'd actually rather members go more so they stay and spend money on expensive salads and smoothies.

                                                                                                                                                    Right now all these AI subscriptions are priced like Planet Fitness, but they're used like Equinox. They're hoping that the new a la carte offerings will move their pricing more in that direction as well.

                                                                                                                                        • charcircuit

                                                                                                                                          today at 6:12 PM

                                                                                                                                          >I’ve seen evidence they lose money on heavy users.

                                                                                                                                          Where?

                                                                                                                                      • cautiouscat

                                                                                                                                        today at 5:42 PM

                                                                                                                                        I assume consumers aren’t a big note in their bottom line. I’m not actually very sure about that, just an assumption.

                                                                                                                                        What I wonder however is if these tools will become something I use at work only. $100/month is already a massive stretch budget wise. If these models keep devouring tokens there’s no way I’d get the same usage time out of them for $100 in usage credits.

                                                                                                                                        I just don’t think I’d use them much at all at home.

                                                                                                                                • oersted

                                                                                                                                  today at 5:38 PM

                                                                                                                                  > Pricing for both models is $10 per million input tokens and $50 per million output tokens.

                                                                                                                                  The step-up in intelligence looks massive (we'll see in practice), but the price is getting to a point where it's making me question if it's even worth giving it a try.

                                                                                                                                  Good competitors will probably be out soon, which should level the playing field. I am more excited about that, just the fact that they showed that such an improvement is possible. I'm okay waiting a bit longer for this to become attainable for plebs like me.

                                                                                                                                    • kolinko

                                                                                                                                      today at 6:00 PM

                                                                                                                                      The pricing can be a bit deceptive though. A good model can deliver the same results in fewer tokens.

                                                                                                                                      Kind of like billing a programmer by the hour.

                                                                                                                                      • xyzsparetimexyz

                                                                                                                                        today at 5:57 PM

                                                                                                                                        This is probably the end of 'use the best model no matter the price'

                                                                                                                                        • sourcecodeplz

                                                                                                                                          today at 5:54 PM

                                                                                                                                          Why wouldn't it be? How much would you pay a scientist at this point to think about a problem for you and give you a solution?

                                                                                                                                      • systemvoltage

                                                                                                                                        today at 6:14 PM

                                                                                                                                        It's interesting that we are seeing a time when subscriptions are not preferred and usage-based billing is.

                                                                                                                                        Pay-as-you-go isn't a common thing in SaaS. For example, except for AWS SES, all email providers are bulk-subscription based.

                                                                                                                                        • nutjob2

                                                                                                                                          today at 6:07 PM

                                                                                                                                          > "offer, then remove"

                                                                                                                                          Sounds like "bait and wait".

                                                                                                                                          If you think about it, the more people pay for these new and more resource hungry models, the longer it takes for them to become no extra cost and the longer it takes the more people are tempted to pay extra.

                                                                                                                                          • FergusArgyll

                                                                                                                                            today at 6:00 PM

                                                                                                                                            I'm about to be priced out of SOTA llms and it's an awful feeling

                                                                                                                                            • aray07

                                                                                                                                              today at 5:22 PM

                                                                                                                                              i have never seen this before - where you offer something and then take that away

                                                                                                                                                • machomaster

                                                                                                                                                  today at 5:28 PM

                                                                                                                                                  Really, you have never heard of shareware or trial periods?

                                                                                                                                                    • tasuki

                                                                                                                                                      today at 6:01 PM

                                                                                                                                                      Either that or it was sarcasm. What do you think is more likely?

                                                                                                                                              • rvz

                                                                                                                                                today at 5:15 PM

                                                                                                                                                > * On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits. If capacity allows, we’ll extend the included window.

                                                                                                                                                Of course, they are a casino as well giving you free spins at the wheel with their new Fable machine, and it is done on purpose.

                                                                                                                                                Once their freebies have expired, many of its users will begin to gamble more on the new casino machine and will realize that it is expensive.

                                                                                                                                                  • xvector

                                                                                                                                                    today at 5:20 PM

                                                                                                                                                    If it's that big of a problem to you, you're free to just... not use the freebie?

                                                                                                                                                      • cautiouscat

                                                                                                                                                        today at 5:24 PM

                                                                                                                                                        It’s an interesting thing to bring up because it’s this classic thing we’ve seen for decades now.

                                                                                                                                                        The ramifications go beyond the individual which is why I assume they mentioned it. They don’t need to use it/not use it for it to have interesting implications.

                                                                                                                                                          • xvector

                                                                                                                                                            today at 5:26 PM

                                                                                                                                                            so it'd be preferable if they didn't include the model at all?

                                                                                                                                                              • cautiouscat

                                                                                                                                                                today at 5:35 PM

                                                                                                                                                                I didn’t say that and I don’t have a feeling on that either way. But this is a limited time trial and calling it out as such is valid.

                                                                                                                                                                Is it nice we get the trial? Sure. Is it also a common play in the playbook of tech companies? Yes.

                                                                                                                                                        • danslo

                                                                                                                                                          today at 5:54 PM

                                                                                                                                                          It's not a freebie, it still requires a subscription and burns tokens twice as fast as Opus.

                                                                                                                                                  • meowface

                                                                                                                                                    today at 5:26 PM

                                                                                                                                                    It's very disappointing but I'm assuming it's for rational reasons on their part.

                                                                                                                                                    • firemelt

                                                                                                                                                      today at 5:46 PM

                                                                                                                                                      damn they are drug dealers

                                                                                                                                                    • simonw

                                                                                                                                                      today at 5:12 PM

                                                                                                                                                      Pelican for Fable 5 on default settings is a clear improvement on Opus 4.8

                                                                                                                                                      Fable 5 default: https://gist.github.com/simonw/036bee5a703e7ec84e34efa974438...

                                                                                                                                                      Opus 4.8 (the "max" one is closest to Fable): https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-s...

                                                                                                                                                      Now here are the Fable pelicans for all five of the thinking effort levels - low, medium, high, xhigh, max: https://tools.simonwillison.net/markdown-svg-renderer#url=ht...

                                                                                                                                                      Low used 25 input, 1,929 output - 9.67 cents: https://www.llm-prices.com/#it=25&ot=1929&sel=claude-fable-5

                                                                                                                                                      Max used 25 input, 14,430 output - 72.175 cents! https://www.llm-prices.com/#it=25&ot=14430&sel=claude-fable-...
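
A quick sanity check of those two figures, as a minimal sketch. It assumes the $10 per million input tokens and $50 per million output tokens quoted elsewhere in this thread; the constant and function names are made up for illustration and are not part of any real API.

```python
# Prices as quoted in the thread (USD per million tokens); assumed, not verified.
INPUT_USD_PER_MTOK = 10
OUTPUT_USD_PER_MTOK = 50


def cost_cents(input_tokens: int, output_tokens: int) -> float:
    """Return the cost of a single request in US cents."""
    # tokens * (USD per million tokens) gives millionths of a dollar.
    micro_usd = (input_tokens * INPUT_USD_PER_MTOK
                 + output_tokens * OUTPUT_USD_PER_MTOK)
    # 1 cent = 10,000 millionths of a dollar.
    return micro_usd / 10_000


if __name__ == "__main__":
    # Token counts reported above for the "low" and "max" effort levels.
    for label, inp, out in [("low", 25, 1929), ("max", 25, 14430)]:
        print(f"{label}: {cost_cents(inp, out):.3f} cents")
```

Under those assumed prices this reproduces the 9.67 and 72.175 cent figures. Nearly all of the cost comes from output tokens, so the max effort level costs about 7.5 times as much as low for the same 25-token prompt.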

                                                                                                                                                        • sempron64

                                                                                                                                                          today at 5:44 PM

                                                                                                                                                          The pelican has looked very same-y across all frontier models, same color bike, same camera angle, etc. I suspect this challenge is already too embedded in the training data to be a good signal when it succeeds, and maybe even when it fails in pathological ways mirroring existing AI pelicans on the internet.

                                                                                                                                                            • tripleee

                                                                                                                                                              today at 6:03 PM

                                                                                                                                                              I'd say it's working great for its intended purpose. Keeps Simon on top of all these threads and funnels traffic to his site.

                                                                                                                                                          • sarreph

                                                                                                                                                            today at 5:20 PM

                                                                                                                                                            I'm beginning to wonder how much of a useful metric the pelican is because surely the frontier labs must be training their models on pelican-artistry because of how well known your test is now?

                                                                                                                                                              • bensyverson

                                                                                                                                                                today at 5:56 PM

                                                                                                                                                                Simon has addressed this on virtually every new model release. He also has unpublished alternate prompts. But the larger point is: this is a fun experiment, not a serious and objective benchmark.

                                                                                                                                                                  • refulgentis

                                                                                                                                                                    today at 6:19 PM

                                                                                                                                                                    It's silly and a joke and a surprisingly good benchmark and don't take it seriously but don't take not taking it seriously seriously and if it's too good we use another prompt but don't actually because then it's not the pelican post and there's obvious ways to better it and it's not worth doing because it's not serious.

                                                                                                                                                                    Only coherent move at this point: hit the minus button immediately. There's never anything about the model in the thread other than simon's post.

                                                                                                                                                                • wongarsu

                                                                                                                                                                  today at 5:41 PM

                                                                                                                                                                  I just run my own benchmark for "draw an SVG with $animal driving $vehicle". I won't post my choice of animal and mode of transport, but there are plenty of uncommon combinations to choose from. So far it's a fun and visually intuitive benchmark that does seem to correlate with model capabilities

                                                                                                                                                                  • modriano

                                                                                                                                                                    today at 5:34 PM

                                                                                                                                                                    I don't know. Just looking at the bike frames (specifically the fact that the AI generated bikes have rather unsteerable front forks), it's clear to me that frontier labs aren't spending much time tuning models to make bikes look coherent, which I assume is an easier task than making a pelican riding a bike look coherent.

                                                                                                                                                                    • HaZeust

                                                                                                                                                                      today at 5:23 PM

                                                                                                                                                                      I've seen this reply to Simon's benchmark for 2 years running now, and yet you still see improvements and objectively-bad results over time from new releases, even when I'm sure every frontier AI team has/had a person at least partially dedicated to better bicycle-pelican SVG outputs. Alas.

                                                                                                                                                                        • sarreph

                                                                                                                                                                          today at 5:26 PM

                                                                                                                                                                          I had intended to caveat that: I'm sure I'm not the first person to ask about this!

                                                                                                                                                                          > you still see improvements

                                                                                                                                                                          This is expected if they are training their models on it, right?

                                                                                                                                                                          > objectively-bad results

                                                                                                                                                                          Keen to learn when this has been the case, i.e. across version increments in major models.

                                                                                                                                                                            • simonw

                                                                                                                                                                              today at 5:29 PM

                                                                                                                                                                              I've written about this a couple of times, most notably here: https://simonwillison.net/2025/Nov/13/training-for-pelicans-...

                                                                                                                                                                              I've been enjoying seeing how the quality of individual models differ based on the amount of reasoning effort you give them. If they were baking an a good pelican you wouldn't expect them to differ so much.

                                                                                                                                                                              (Google Gemini are the only lab that have very clearly paid attention to the quality of SVG animals-riding-vehicles, see their announcement for Gemini 3.1: https://twitter.com/JeffDean/status/2024525132266688757 )

                                                                                                                                                                                • sarreph

                                                                                                                                                                                  today at 5:31 PM

                                                                                                                                                                                  Amazing, thank you Simon! Look forward to reading.

                                                                                                                                                                                    • 38484858

                                                                                                                                                                                      today at 6:02 PM

                                                                                                                                                                                      [flagged]

                                                                                                                                                                          • llm_nerd

                                                                                                                                                                            today at 5:32 PM

                                                                                                                                                                            I honestly assumed their comment was tongue in cheek humour, because positively no one actually cares how these models generate an SVG pelican riding a bicycle. It's some meme thing that this stuff always appears here.

                                                                                                                                                                              • BrokenCogs

                                                                                                                                                                                today at 5:38 PM

                                                                                                                                                                                Yeah this is not a real benchmark, it's just a fun tradition every time a new model is released

                                                                                                                                                                                  • pelipost123

                                                                                                                                                                                    today at 5:47 PM

                                                                                                                                                                                    "fun" / boringly predictable meme thread with 30+ replies already

                                                                                                                                                                    • ealready_value

                                                                                                                                                                      today at 5:15 PM

                                                                                                                                                                      This is the reply I look for in all the new model announcements. It's fun to tell people that I judge models based on pelicans.

                                                                                                                                                                        • pixel_popping

                                                                                                                                                                          today at 5:34 PM

                                                                                                                                                                          This is all we need. The moment the pelican puts its leg behind the frame, we are all doomed.

                                                                                                                                                                          • chorkpop

                                                                                                                                                                            today at 5:17 PM

                                                                                                                                                                            Now someone post the link about how it’s impossible for humans to draw a bike from memory.

                                                                                                                                                                        • ethanlipson

                                                                                                                                                                          today at 5:19 PM

                                                                                                                                                                          How much money do you think they spent fine-tuning on pelican SVG generation?

                                                                                                                                                                            • tarruda

                                                                                                                                                                              today at 5:26 PM

                                                                                                                                                                              Not as much as Qwen, since apparently 3.6 35B surpassed Opus 4.7 https://x.com/simonw/status/2044830134885306701

                                                                                                                                                                              • csomar

                                                                                                                                                                                today at 5:23 PM

                                                                                                                                                                                Probably none. They probably have much better targets to optimize for than an SVG pelican or even SVGs in general.

                                                                                                                                                                            • rkuska

                                                                                                                                                                              today at 5:39 PM

                                                                                                                                                                              Is it possible to use the credits from subscription (https://support.claude.com/en/articles/15036540-use-the-clau...) for fable?

                                                                                                                                                                              • leecommamichael

                                                                                                                                                                                today at 5:20 PM

                                                                                                                                                                                Looks like Fable constructed the "max" "looking" pelican of the previous model for the "xhigh" output token count of the previous model.

                                                                                                                                                                                • 382hi

                                                                                                                                                                                  today at 5:41 PM

                                                                                                                                                                                  I'm pretty sure they're optimizing the models around these sorts of tests.

                                                                                                                                                                                  • makingstuffs

                                                                                                                                                                                    today at 5:52 PM

                                                                                                                                                                                    I could be tripping but I’m sure that is very similar to the Deepseek one from not long ago. Clearly I am too lazy to go and find it for verification.

                                                                                                                                                                                    • redox99

                                                                                                                                                                                      today at 5:28 PM

                                                                                                                                                                                      It's interesting that they still get the head tube / handlebar part wrong.

                                                                                                                                                                                        • aarjaneiro

                                                                                                                                                                                          today at 5:44 PM

                                                                                                                                                                                          Or the hands not being wings

                                                                                                                                                                                      • mercacona

                                                                                                                                                                                        today at 5:18 PM

                                                                                                                                                                                        Why always sunny days?

                                                                                                                                                                                          • umeshunni

                                                                                                                                                                                            today at 5:25 PM

                                                                                                                                                                                            Pelicans hate biking in the rain (as do I).

                                                                                                                                                                                        • csomar

                                                                                                                                                                                          today at 5:22 PM

                                                                                                                                                                                          Where is the clear improvement on Fable 5? The tail is misplaced.

                                                                                                                                                                                          • david_shi

                                                                                                                                                                                            today at 5:40 PM

                                                                                                                                                                                            that's a great looking pelican

                                                                                                                                                                                            • ge96

                                                                                                                                                                                              today at 5:32 PM

                                                                                                                                                                                              need more Alex Moulton style bikes

                                                                                                                                                                                              • kylehotchkiss

                                                                                                                                                                                                today at 5:32 PM

                                                                                                                                                                                                How many barrels of oil are burned per pelican at Fable levels?

                                                                                                                                                                                                • simunskxcsckss

                                                                                                                                                                                                  today at 5:17 PM

                                                                                                                                                                                                  [flagged]

                                                                                                                                                                                                    • minimaxir

                                                                                                                                                                                                      today at 5:21 PM

                                                                                                                                                                                                      You can't tell someone to "get a life" while taking the effort to create a burner account for the sole purpose of insulting someone.

                                                                                                                                                                                                      • rvz

                                                                                                                                                                                                        today at 5:20 PM

                                                                                                                                                                                                        I don't really consider that a great benchmark anyway and we really need better ones that are objective instead of these mostly performative and cheatable and also available in the training set.

                                                                                                                                                                                                        • ilaksh

                                                                                                                                                                                                          today at 5:20 PM

                                                                                                                                                                                                          Simon's pelicans are an institution. Are you trying to get banned? Lmao.

                                                                                                                                                                                                            • rob

                                                                                                                                                                                                              today at 5:45 PM

                                                                                                                                                                                                              I think it's a clever thing he did to basically guarantee he continues to get major traffic to his blog here every time a model is released, especially since he's taking sponsorships with a static banner at the top of every page now. I think he's trying to go the Daring Fireball route.

                                                                                                                                                                                                              • today at 5:24 PM

                                                                                                                                                                                                                • brazukadev

                                                                                                                                                                                                                  today at 5:25 PM

                                                                                                                                                                                                                  For me it is like if crypto bros were allowed to shill their DAOs and tokens during the crypto/NFT phase.

                                                                                                                                                                                                                  He is the only person not getting rate-limited for shilling AI all the time.

                                                                                                                                                                                                                    • simonw

                                                                                                                                                                                                                      today at 5:42 PM

                                                                                                                                                                                                                      Pointing out how much the models still suck at drawing pelicans is a funny way to shill them.

                                                                                                                                                                                                                        • toraway

                                                                                                                                                                                                                          today at 6:06 PM

                                                                                                                                                                                                                          Tbf the first line of your first comment is:

                                                                                                                                                                                                                            > Pelican for Fable 5 on default settings is a clear improvement on Opus 4.8
                                                                                                                                                                                                                          
                                                                                                                                                                                                                          And doesn't contain any actual criticism within the comment (your blog post might, but just referring to what was posted on HN, which is a bit booster-y on its own).

                                                                                                                                                                                                                            • simonw

                                                                                                                                                                                                                              today at 6:18 PM

                                                                                                                                                                                                                              The entire pelican benchmark is a joke. The joke is that, for all of the billions of dollars poured into these things and the claims of PhD level intelligence, they still draw pelicans not-much-better than a five year-old would.

                                                                                                                                                                                                                              I don't spell that joke out in every comment I post here because that wouldn't be very funny.

                                                                                                                                                                                                      • cuuupid

                                                                                                                                                                                                        today at 5:15 PM

                                                                                                                                                                                                        Not missing the forest for the trees, this effectively means in 3-5 months China will drop open source models that are every bit as capable and dangerous as current day Mythos except with no safeguards.

                                                                                                                                                                                                        And the only companies safe from this are the large corporations that shook hands with Anthropic? Because Fable doesn't seem to have actual safeguards, more like 'if you talk about this you will be talking to Opus.' It doesn't guard against offensive use, it prevents all use (offensive AND defensive).

                                                                                                                                                                                                        Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF

                                                                                                                                                                                                          • hootz

                                                                                                                                                                                                            today at 5:22 PM

                                                                                                                                                                                                            My bet is that Mythos is still over-hyped and the cybersecurity fear and guardrails are mostly marketing to force company partnerships through Glasswing and get public attention.

                                                                                                                                                                                                              • miohtama

                                                                                                                                                                                                                today at 5:52 PM

                                                                                                                                                                                                                Mythos is from the same guy who did "GPT-2 is too dangerous to release"

                                                                                                                                                                                                                https://naokishibuya.github.io/blog/2022-12-30-gpt-2-2019/

                                                                                                                                                                                                                  • oceansky

                                                                                                                                                                                                                    today at 5:59 PM

                                                                                                                                                                                                                    He was kinda right.

                                                                                                                                                                                                                    Lawyers, doctors, students, teachers. Lots of people using GPT models carelessly in harmful ways.

                                                                                                                                                                                                                • bel8

                                                                                                                                                                                                                  today at 5:48 PM

                                                                                                                                                                                                                  It worked for OpenAI when GPT 3 was deemed too dangerous to be released. This is just a spin of that.

                                                                                                                                                                                                                    • hootz

                                                                                                                                                                                                                      today at 5:55 PM

                                                                                                                                                                                                                      I still remember it. "Open"AI going API-only because GPT-3 is really really dangerous, so forget the Open in our name and all of that, you can't download our models anymore and must request access to them because they pose a THREAT.

                                                                                                                                                                                                                      Fast forward to today and GPT-3 has laughable performance.

                                                                                                                                                                                                                        • shoeb00m

                                                                                                                                                                                                                          today at 6:12 PM

                                                                                                                                                                                                                          Even back then there were plenty of people who got fooled by AI generated articles. It's easier to spot AI writing now because we are so used to it. They were right to be concerned; not that it achieved much since oss models run laps around gpt-3 now.

                                                                                                                                                                                                                            • hootz

                                                                                                                                                                                                                              today at 6:21 PM

                                                                                                                                                                                                                              But it seems like that was not genuine concern, but instead a tactic to pivot to closed models and an API service with an excuse to do so, breaking the public's expectation that they would be a non-profit making open models, like their name implies.

                                                                                                                                                                                                                  • geerlingguy

                                                                                                                                                                                                                    today at 5:33 PM

                                                                                                                                                                                                                    Bingo.

                                                                                                                                                                                                                    "We had to do extra work to make this safe because it's so advanced and dangerous..." how many times can they trot out that line before it loses its effect entirely?

                                                                                                                                                                                                                      • OtomotO

                                                                                                                                                                                                                        today at 6:29 PM

                                                                                                                                                                                                                        With homo "sapiens" "sapiens"? A few decades at least.

                                                                                                                                                                                                                        • copperx

                                                                                                                                                                                                                          today at 6:22 PM

                                                                                                                                                                                                                          Only three times, if fables are right.

                                                                                                                                                                                                                          • aesthesia

                                                                                                                                                                                                                            today at 6:23 PM

                                                                                                                                                                                                                            I mean, they do actually describe what that extra work was, and people elsewhere in this thread are complaining about the effects of those safeguards. So it's not like this is purely empty rhetoric.

                                                                                                                                                                                                                        • ls612

                                                                                                                                                                                                                          today at 5:36 PM

                                                                                                                                                                                                                          And to ensure that only USG-approved entities are allowed to secure their code.

                                                                                                                                                                                                                      • mpeg

                                                                                                                                                                                                                        today at 5:33 PM

                                                                                                                                                                                                                        It's not even very usable... I tried 2 different chats and both eventually got stopped due to the safeguards

                                                                                                                                                                                                                        One was a piece of code I gave it to improve, it did so and then started writing tests, some of which tested security so the safeguards triggered

                                                                                                                                                                                                                        Another was one of the cryptography puzzles I use as new model tests, which are hard to oneshot and there's no public solution anywhere, it completely refused to even try to solve it

                                                                                                                                                                                                                          • Erem

                                                                                                                                                                                                                            today at 6:04 PM

                                                                                                                                                                                                                            So the degradation to Opus 4.8 from the article isn't happening in practice?

                                                                                                                                                                                                                              • mtkd

                                                                                                                                                                                                                                today at 6:16 PM

                                                                                                                                                                                                                                No, you get an AUP violation and have to manually swap the model

                                                                                                                                                                                                                                (I had the same issue; I just asked it to check some code that 4.8 had modified earlier in the day)

                                                                                                                                                                                                                                • andai

                                                                                                                                                                                                                                  today at 6:09 PM

                                                                                                                                                                                                                                  Maybe that's only in the chat UI, and not the API?

                                                                                                                                                                                                                          • himata4113

                                                                                                                                                                                                                            today at 5:27 PM

                                                                                                                                                                                                                            They're trained in a model class likely in the 2t to 3t range. It's very unlikely that Chinese labs have access to GPU systems capable of training models like that, let alone serving them. This requires proprietary room-scale systems which fetch a huge premium over typical 10 slot systems.

                                                                                                                                                                                                                            I am sure that they can develop their own equivalent version of such clusters in around 1 year though. Distilling Fable 5 will also go a long way.

                                                                                                                                                                                                                              • logicprog

                                                                                                                                                                                                                                today at 5:33 PM

                                                                                                                                                                                                                                DSv4 is nearly in the 2t range, but yes you're generally right

                                                                                                                                                                                                                                  • himata4113

                                                                                                                                                                                                                                    today at 5:37 PM

                                                                                                                                                                                                                                    MoE experts were likely trained independently / in a sparse format. Training anything beyond 2t on typical systems would be infuriantingly slow, you could do 4t on nvidias room-scale solution, but for a reasonable training speed / batch size it caps around 3t.

                                                                                                                                                                                                                                      • sosodev

                                                                                                                                                                                                                                        today at 5:48 PM

                                                                                                                                                                                                                                        Do you have any resources to share regarding independent expert training? I was under the impression that it's not feasible.

                                                                                                                                                                                                                                          • himata4113

                                                                                                                                                                                                                                            today at 6:01 PM

                                                                                                                                                                                                                                            concept is similar to how it works in inference, instead of performing regressive writes to the entire model you run the whole model, but part of the model can live in system memory and get swapped in/out on demand. So only XB parameters are active in training.

                                                                                                                                                                                                                                            edit: I am not really sure if it works like that. I haven't looked too deep into deepseek v4 pro specifically.

                                                                                                                                                                                                                            • dmantis

                                                                                                                                                                                                                              today at 5:33 PM

                                                                                                                                                                                                                              Isn't that a good thing in a way? If everyone has the weapon and defense at the same time, we will fix security holes and live safer lives instead of having some three letter agencies and military backdoors in everything.

                                                                                                                                                                                                                              Pandora's box is open anyway. It's better now for everyone to have the same power rather than a few nation states.

                                                                                                                                                                                                                                • lebovic

                                                                                                                                                                                                                                  today at 6:05 PM

                                                                                                                                                                                                                                  Not sure this holds, sadly. I spent a few months reporting serious security bugs as model capabilities took off earlier this year, and only ~half were fixed. The unfixed bugs were just as critical as the fixed ones; sometimes they were even two similarly critical bugs at the same company, and only one would be fixed!

                                                                                                                                                                                                                                  On your other point, the government still has systemic leverage and can compel access, so this doesn't remove that risk.

                                                                                                                                                                                                                                  That doesn't mean this is the end of the world, and some balance of power is usually good. But I do think it will still increase the capabilties of rogue actors and their net harm.

                                                                                                                                                                                                                              • sosodev

                                                                                                                                                                                                                                today at 5:55 PM

                                                                                                                                                                                                                                I wonder if model distillation will continue to work as well as it has. Given hidden reasoning, the ever expanding number of expected capabilities, a serious compute shortage, the looming possibility of model collapse, and dramatically higher API costs I would guess that it's getting much harder to do.

                                                                                                                                                                                                                                • ibejoeb

                                                                                                                                                                                                                                  today at 6:11 PM

                                                                                                                                                                                                                                  I don't think China has any incentive to arm the rest of the world with highly capable models that can be used against them. Undoubtedly they will continue with the arms race, but they will preserve the best stuff for their own use.

                                                                                                                                                                                                                                    • james2doyle

                                                                                                                                                                                                                                      today at 6:18 PM

                                                                                                                                                                                                                                      I think the stronger incentive is undermining/undercutting the Western AI companies. Given what we have seen, any model can be used/convinced to do harm so that is just part of the game

                                                                                                                                                                                                                                        • ibejoeb

                                                                                                                                                                                                                                          today at 6:28 PM

                                                                                                                                                                                                                                          I agree, depending on how much of this is marketing and how much is actual capability. It's one thing to undercut models that finish writing assignments for lazy students. If this actually identifies vulns and writes exploits, or if it designs bioweapons, those are pretty different. Those are actual weapons, and I don't think they're going to arm the adversary.

                                                                                                                                                                                                                                  • soledades

                                                                                                                                                                                                                                    today at 6:02 PM

                                                                                                                                                                                                                                    > Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF.

                                                                                                                                                                                                                                    Based.

                                                                                                                                                                                                                                    • jstummbillig

                                                                                                                                                                                                                                      today at 5:57 PM

                                                                                                                                                                                                                                      I wonder where the trees are. In this thread nobody appears to actually be talking about the model.

                                                                                                                                                                                                                                      • FergusArgyll

                                                                                                                                                                                                                                        today at 6:06 PM

                                                                                                                                                                                                                                        I think we're about to see a big relative drop-off of open models vs closed. I don't think there'll be an open model that competes with Mythos for ~2 years.

                                                                                                                                                                                                                                        Even OpenAI and Google are struggling to get this kind of performance. If the distillation defenses are any good + chip controls prevent China from training massive models, it's over.

                                                                                                                                                                                                                                        • deaton

                                                                                                                                                                                                                                          today at 5:53 PM

                                                                                                                                                                                                                                          Oh they might try to put in place safeguards, but Qwen has had no problem being abliterated

                                                                                                                                                                                                                                          • m3kw9

                                                                                                                                                                                                                                            today at 5:34 PM

                                                                                                                                                                                                                                            3-5 months is a long time and they are pretty useless on arrival because the frontier models are so good, that it's hard to go back even if it's way cheaper. Your work flow is adapted to that level of intelligence for months.

                                                                                                                                                                                                                                              • hootz

                                                                                                                                                                                                                                                today at 5:53 PM

                                                                                                                                                                                                                                                That doesn't match my experience at all. I can't see myself saying in 6 months that the current model I am using is useless, that makes no sense.

                                                                                                                                                                                                                                                In fact, I did go back to DeepSeek V4 Flash for most of my problems as it is way cheaper and there is no need to use SOTA for absolutely everything.

                                                                                                                                                                                                                                            • xdennis

                                                                                                                                                                                                                                              today at 5:45 PM

                                                                                                                                                                                                                                              > every bit as capable and dangerous as current day Mythos except with no safeguards

                                                                                                                                                                                                                                              Not quite. They will definitely have "no criticism of China/communism" safeguards.

                                                                                                                                                                                                                                                • hootz

                                                                                                                                                                                                                                                  today at 5:48 PM

                                                                                                                                                                                                                                                  People can work around those if they are open-weight.

                                                                                                                                                                                                                                                  • xyzsparetimexyz

                                                                                                                                                                                                                                                    today at 6:14 PM

                                                                                                                                                                                                                                                    Try asking Fable if Israel is committing a genocide

                                                                                                                                                                                                                                            • dannyw

                                                                                                                                                                                                                                              today at 6:03 PM

                                                                                                                                                                                                                                              Impressions from testing Fable 5 prior to launch:

                                                                                                                                                                                                                                              • My most noticeable immediate jump was in how its frontend design was much more intentionally crafted, and delightful without feeling like 'AI vibe coded'; with better end-user usability too.

                                                                                                                                                                                                                                              • In some internal agentic harnesses, it achieved better results with about half the tokens, making it cost the ~same as Opus 4.8 price-wise! The real price increase is less than 2x; with biggest differences in harder problems where Opus 4.8 struggles (or needs many turns).

                                                                                                                                                                                                                                              • Part of the token efficiency improvements come from Fable doing more targeted and surgical diffs, with less non-necessary changes. This is great, because PRs often have less LoC changes for review. It writes more maintainable code without explicit human steering.

                                                                                                                                                                                                                                              • For general conversation and assistant style use cases, didn’t really notice a difference vs 4.8.

                                                                                                                                                                                                                                              • 1M context window, without increased pricing for long context is AWESOME. This is a massive win.

                                                                                                                                                                                                                                              • The classifiers are super aggressive and sensitive and this does happen for very benign, non-security coding tasks. Fallbacks to 4.8 worked like a charm; but the filters are definitely super sensitive.

                                                                                                                                                                                                                                              Overall, I would describe this as a step change and worthy of the "Claude 5" model name. It did take some time to understand the intelligence ceiling of this model; and even with an extended testing window I'm still discovering new things and often surprised (in a good way) by the model.
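A back-of-the-envelope sketch of the cost point above. It assumes (this is not stated in the thread) that Fable 5's per-token price is about 2x that of Opus 4.8; the function name and the ratios are illustrative only, not real pricing.

```python
def relative_cost(price_ratio: float, token_ratio: float) -> float:
    """Cost of a task on the new model relative to the old one.

    price_ratio: new per-token price / old per-token price (assumed, not official)
    token_ratio: tokens the new model uses / tokens the old model uses
    """
    return price_ratio * token_ratio


# Assumed 2x per-token price, half the tokens: the task costs about the same.
hard_task = relative_cost(price_ratio=2.0, token_ratio=0.5)

# Same assumed price, but only a 20% token saving: the task costs more, yet under 2x.
easy_task = relative_cost(price_ratio=2.0, token_ratio=0.8)

print(f"hard task: {hard_task:.2f}x, easy task: {easy_task:.2f}x")
```

Under these assumptions the effective price increase only reaches the full 2x when the new model saves no tokens at all.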

                                                                                                                                                                                                                                                • morley

                                                                                                                                                                                                                                                  today at 6:21 PM

                                                                                                                                                                                                                                                  Can I ask how you gained preview access to Fable 5?

                                                                                                                                                                                                                                              • andai

                                                                                                                                                                                                                                                today at 5:26 PM

                                                                                                                                                                                                                                                > Distillation. We’ve previously identified large-scale attempts to extract (“distill”) Claude’s capabilities to train competing models in authoritarian countries.

                                                                                                                                                                                                                                                Glad to hear the UK is finally making an effort to catch up on the AI front ;)

                                                                                                                                                                                                                                                  • james2doyle

                                                                                                                                                                                                                                                    today at 6:20 PM

                                                                                                                                                                                                                                                    Just last week you could distill using other users responses! Handy!

                                                                                                                                                                                                                                                    • b3kart

                                                                                                                                                                                                                                                      today at 5:47 PM

                                                                                                                                                                                                                                                      https://en.wikipedia.org/wiki/The_Economist_Democracy_Index

                                                                                                                                                                                                                                                      Probably tongue-in-cheek, but UK 18th, US joint 34th with Poland

                                                                                                                                                                                                                                                        • m0guz

                                                                                                                                                                                                                                                          today at 6:00 PM

                                                                                                                                                                                                                                                          > The Democracy Index published by the British media company

                                                                                                                                                                                                                                                          We decided that we aren't one of those authoritarian countries.

                                                                                                                                                                                                                                                          • Petersipoi

                                                                                                                                                                                                                                                            today at 6:09 PM

                                                                                                                                                                                                                                                            > published by the British media company the Economist Group

                                                                                                                                                                                                                                                            Haha, it's literally the first sentence of the Wikipedia page. That's fucking funny. Try again.

                                                                                                                                                                                                                                                            • solenoid0937

                                                                                                                                                                                                                                                              today at 5:58 PM

                                                                                                                                                                                                                                                              Most of these indexes are made by ideologically motivated people.

                                                                                                                                                                                                                                                              In the UK you get thrown in prison for making a slightly unfriendly tweet. Freedom of speech simply does not exist.

                                                                                                                                                                                                                                                              No sane person sees that as being less authoritarian.

                                                                                                                                                                                                                                                                • 10xDev

                                                                                                                                                                                                                                                                  today at 6:02 PM

                                                                                                                                                                                                                                                                  >the quality of discussion on HN has gone to shit, i miss when model released used to have actual informed takes from people that used them or substantive discussion about the system card

                                                                                                                                                                                                                                                                  Your comment earlier.

                                                                                                                                                                                                                                                                  Edit: also, not much change in the last 10 years in prison population. https://commonslibrary.parliament.uk/research-briefings/sn04...

                                                                                                                                                                                                                                                                  • JustSkyfall

                                                                                                                                                                                                                                                                    today at 6:01 PM

                                                                                                                                                                                                                                                                    > In the UK you get thrown in prison for making a slightly unfriendly tweet.

                                                                                                                                                                                                                                                                    Do you? The closest thing I can think about is how someone was jailed for encouraging arson attacks on asylum hotels. I'd be extremely surprised if the US had zero cases of somebody receiving a police visit after threatening to kill the President or bomb a school or something...

                                                                                                                                                                                                                                                                    (FWIW I do think the UK needs stronger free speech protections, but saying that you'll be immediately jailed for writing unfriendly tweets is a huge stretch)

                                                                                                                                                                                                                                                          • dyauspitr

                                                                                                                                                                                                                                                            today at 5:45 PM

                                                                                                                                                                                                                                                            Rookie numbers. Come to the US to see auth done right.

                                                                                                                                                                                                                                                              • today at 5:48 PM

                                                                                                                                                                                                                                                                • PUSH_AX

                                                                                                                                                                                                                                                                  today at 6:12 PM

                                                                                                                                                                                                                                                                  Uh oh-auth

                                                                                                                                                                                                                                                          • HoyaSaxa

                                                                                                                                                                                                                                                            today at 6:31 PM

                                                                                                                                                                                                                                                            > When Claude Fable 5 is used, Anthropic retains data, including prompts and outputs, to operate safety classifiers that detect harmful use. Other Claude models in GitHub Copilot remain covered by GitHub's existing data retention agreements

                                                                                                                                                                                                                                                            On GitHub Copilot for Business, Claude Fable 5 is only available if you are willing to let Anthropic retain your data. That in conjunction with the model being removed from plans in a couple of weeks leads me to believe that Anthropic is between training runs and using this as an opportunity to grab way more training data...

                                                                                                                                                                                                                                                            • sigmar

                                                                                                                                                                                                                                                              today at 5:08 PM

                                                                                                                                                                                                                                                              The system card is 319 pages, at what point do we call it a "book" instead of a "card"?

                                                                                                                                                                                                                                                              There's a quote from a METR report on page 52:

                                                                                                                                                                                                                                                              >We ran [Mythos 5] on 38 of our hardest software tasks, including tasks centered around R&D. [Mythos 5] generally outperformed an early checkpoint of Claude Mythos Preview in these, including by succeeding on some tasks that had not been solved by any public model we have previously evaluated. However, we still observed the model occasionally failing to correctly interpret nuanced instructions in difficult tasks... Based on the available evidence, we believe [Mythos 5] is likely unable to fully and reliably automate R&D for frontier projects spanning multiple weeks. We believe that a better, more confident assessment would require more time, evaluations, and information from the model developer.

                                                                                                                                                                                                                                                                • baq

                                                                                                                                                                                                                                                                  today at 5:13 PM

                                                                                                                                                                                                                                                                  > we believe [Mythos 5] is likely unable to fully and reliably automate R&D for frontier projects spanning multiple weeks

                                                                                                                                                                                                                                                                  this is good news, right? right...?

                                                                                                                                                                                                                                                                    • yaodub

                                                                                                                                                                                                                                                                      today at 5:33 PM

                                                                                                                                                                                                                                                                      Depends whether "unable to fully automate" means "needs occasional human checkpoints" or "slowly stops caring about your actual goal." Pretty different.

                                                                                                                                                                                                                                                                      • woeirua

                                                                                                                                                                                                                                                                        today at 5:15 PM

                                                                                                                                                                                                                                                                        lmao, i love how the goal post is now in the "multiple weeks" timeline

                                                                                                                                                                                                                                                                          • applfanboysbgon

                                                                                                                                                                                                                                                                            today at 5:20 PM

                                                                                                                                                                                                                                                                            (according to the people marketing it)

                                                                                                                                                                                                                                                                    • romanovcode

                                                                                                                                                                                                                                                                      today at 6:09 PM

                                                                                                                                                                                                                                                                      But did it mention developer in the park eating the sandwitch? That is the most important question!

                                                                                                                                                                                                                                                                  • jkelleyrtp

                                                                                                                                                                                                                                                                    today at 5:10 PM

                                                                                                                                                                                                                                                                    On the new FrontierCode [1] benchmark (ie graded from an OSS maintainer's perspective of "would I merge this code?")

                                                                                                                                                                                                                                                                    - Opus 4.7 xhigh: 5.2%

                                                                                                                                                                                                                                                                    - Opus 4.8 xhigh: 13.4%

                                                                                                                                                                                                                                                                    - Fable 5 xhigh: 29.3%

                                                                                                                                                                                                                                                                    Seems like a huge jump.

                                                                                                                                                                                                                                                                    [1] https://cognition.ai/blog/frontier-code
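
                                                                                                                                                                                                                                                                    To put the jump in numbers, here is a rough back-of-envelope sketch. It only does arithmetic on the three scores quoted above; the helper name `jumps` is made up for illustration and nothing here comes from the benchmark authors.

```python
# Scores as quoted in the comment above (percent of tasks judged mergeable).
scores = {
    "Opus 4.7 xhigh": 5.2,
    "Opus 4.8 xhigh": 13.4,
    "Fable 5 xhigh": 29.3,
}


def jumps(scores):
    """Return (from, to, point gain, ratio) for each consecutive pair of entries."""
    names = list(scores)  # dicts preserve insertion order
    out = []
    for prev, cur in zip(names, names[1:]):
        gain = round(scores[cur] - scores[prev], 1)   # absolute gain in percentage points
        ratio = round(scores[cur] / scores[prev], 2)  # relative multiplier
        out.append((prev, cur, gain, ratio))
    return out


for prev, cur, gain, ratio in jumps(scores):
    print(f"{prev} -> {cur}: +{gain} points, {ratio}x")
```

                                                                                                                                                                                                                                                                    Each step is a bit more than a doubling, so the "huge jump" reading holds in relative terms, though the absolute scores are still low.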

                                                                                                                                                                                                                                                                      • amluto

                                                                                                                                                                                                                                                                        today at 5:39 PM

                                                                                                                                                                                                                                                                        That blog post really makes it look like it's graded from an LLM's estimation of an OSS maintainer's review. I see three issues:

                                                                                                                                                                                                                                                                        1. That estimate could easily be wrong.

                                                                                                                                                                                                                                                                        2. That estimate is, of course, usable in RL training. This isn't an inherently bad thing, and this is more or less what has improved coding models so much lately. But it does mean that other companies could and surely will do this sort of training, and Anthropic probably did too.

                                                                                                                                                                                                                                                                        3. OSS maintainers are far from perfect, and there's an unfortunate uncanny valley-like effect in which a coding model can produce code that is just convincing enough to pass review even though it's actually totally wrong. I don't know whether this is a specific issue here.

                                                                                                                                                                                                                                                                        • zzleeper

                                                                                                                                                                                                                                                                          today at 5:25 PM

                                                                                                                                                                                                                                                                          How credible is this benchmark? does it correlated with others real world experience?

                                                                                                                                                                                                                                                                            • schipperai

                                                                                                                                                                                                                                                                              today at 6:29 PM

                                                                                                                                                                                                                                                                              Cognition did well in documenting their approach [1].

                                                                                                                                                                                                                                                                              TL;DR - they worked with OSS project maintainers to build tasks. They score models based on whether a PR is mergeable. All tasks are graded by a human researcher. SoTA models have hill-climbing to do which raises the bar and inspires confidence. I'd say it's legit.

                                                                                                                                                                                                                                                                              [1]: https://x.com/cognition/status/2064061031912288715

                                                                                                                                                                                                                                                                              • bfeynman

                                                                                                                                                                                                                                                                                today at 5:59 PM

                                                                                                                                                                                                                                                                                Given it was made by cognition (team behind devin flop) who now just got to wait out until claude and gpt5 basically do all of the work for them - not very. When you read about it, the framework is highly subjective. Which very quickly becomes a problem because its based on heuristics that probably change a bunch with a better code model.

                                                                                                                                                                                                                                                                                  • vanuatu

                                                                                                                                                                                                                                                                                    today at 6:02 PM

                                                                                                                                                                                                                                                                                    the subjective framework is exactly why its good

                                                                                                                                                                                                                                                                                    prior bms relied mostly on unit tests or synthetic judges which are easily benchmaxxed, which leads to nobody trusting benchmarks

                                                                                                                                                                                                                                                                                    we need people manually checking the data for good code quality

                                                                                                                                                                                                                                                                                • vanuatu

                                                                                                                                                                                                                                                                                  today at 6:00 PM

                                                                                                                                                                                                                                                                                  i worked on one of the benchmarks typically found in new model releases

                                                                                                                                                                                                                                                                                  this benchmark looks very good from the methodology. a cog researcher checking the data themselves is very high signal (not scaleable so don't take the benchmark as gospel, but directionally good)

                                                                                                                                                                                                                                                                                  • Catloafdev

                                                                                                                                                                                                                                                                                    today at 5:29 PM

                                                                                                                                                                                                                                                                                    It's a relatively new benchmark but from what I can tell it has serious cred behind it. I assume it will be picked up as part of the standard suite of CS-related benchmarks soon enough.

                                                                                                                                                                                                                                                                                    • emp17344

                                                                                                                                                                                                                                                                                      today at 5:29 PM

                                                                                                                                                                                                                                                                                      Seems like it literally popped up yesterday with the express purpose of building hype for this release.

                                                                                                                                                                                                                                                                                        • osti

                                                                                                                                                                                                                                                                                          today at 6:13 PM

                                                                                                                                                                                                                                                                                          And notable absence of DeepSWE benchmark where they do badly, but somehow a benchmark that was published yesterday is in this announcement.

                                                                                                                                                                                                                                                                                          • vanuatu

                                                                                                                                                                                                                                                                                            today at 5:57 PM

                                                                                                                                                                                                                                                                                            i doubt it, cog wants coding agents to be better because it directly improves their product

                                                                                                                                                                                                                                                                                            they aren't married to a particular lab, most of their usage is their in house model i believe

                                                                                                                                                                                                                                                                                            • anthonypasq

                                                                                                                                                                                                                                                                                              today at 5:33 PM

                                                                                                                                                                                                                                                                                              what incentive does Cognition have for doing this? seems like complete nonsense speculation on your part.

                                                                                                                                                                                                                                                                                                • bel8

                                                                                                                                                                                                                                                                                                  today at 5:45 PM

                                                                                                                                                                                                                                                                                                  With billions/trillions of dollars floating around, is it hard to imagine benchmarks could be biased?

                                                                                                                                                                                                                                                                                                  I think it's safe to assume everything AI related is heavily biased until proven otherwise. Just like in pharma.

                                                                                                                                                                                                                                                                                                    • camdenreslink

                                                                                                                                                                                                                                                                                                      today at 6:22 PM

                                                                                                                                                                                                                                                                                                      People game benchmarks for fake internet points to get their favorite web framework to the top of the list. I'm pretty sure they will do it for billions of dollars.

                                                                                                                                                                                                                                                                                      • hydra-f

                                                                                                                                                                                                                                                                                        today at 5:17 PM

                                                                                                                                                                                                                                                                                        Yes, and the price reflects that

                                                                                                                                                                                                                                                                                          • leecommamichael

                                                                                                                                                                                                                                                                                            today at 5:22 PM

                                                                                                                                                                                                                                                                                            I'm not familiar with model pricing trends, did they clearly state how the new pricing compares? (Note that I'm actually asking a question, and am not arguing)

                                                                                                                                                                                                                                                                                            EDIT: Oh I see, this is the best link for pricing https://platform.claude.com/docs/en/about-claude/pricing

                                                                                                                                                                                                                                                                                            So the price is double across the board...

                                                                                                                                                                                                                                                                                              • bhelkey

                                                                                                                                                                                                                                                                                                today at 5:28 PM

                                                                                                                                                                                                                                                                                                >Fable 5 and Mythos 5 are being offered at $10 per million input tokens and $50 per million output tokens

                                                                                                                                                                                                                                                                                                From their pricing page, Opus 4.8 costs $5 per million input tokens and $25 per million output tokens [1].

                                                                                                                                                                                                                                                                                                [1] https://platform.claude.com/docs/en/about-claude/models/over...

                                                                                                                                                                                                                                                                                                  • wongarsu

                                                                                                                                                                                                                                                                                                    today at 5:47 PM

                                                                                                                                                                                                                                                                                                    Still cheaper than Opus 4.0 and 4.1 (which was and still is $15/MTok input and $75/MTok output)

                                                                                                                                                                                                                                                                                                    I would have expected Mythos to be much more expensive than just 2x current Opus (which is clearly cheaper to run than original Opus)

                                                                                                                                                                                                                                                                                                • hydra-f

                                                                                                                                                                                                                                                                                                  today at 5:29 PM

                                                                                                                                                                                                                                                                                                  As per OpenRouter:

                                                                                                                                                                                                                                                                                                  Input Price $10/M tokens

                                                                                                                                                                                                                                                                                                  Output Price $50/M tokens

                                                                                                                                                                                                                                                                                                  Cache Read $1/M tokens

                                                                                                                                                                                                                                                                                                  Cache Write $12.50/M tokens

                                                                                                                                                                                                                                                                                                  2x Claude Opus 4.8, same as Claude Opus 4.8 (Fast)

                                                                                                                                                                                                                                                                                                  Frankly, not even Opus 4.8 would be enough of an incentive to use at that price range (enterprise-wise; would not even bat an eye as a consumer)
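
To make the "2x" comparison concrete, here is a minimal sketch of the per-request arithmetic. It assumes the per-million-token prices quoted in this thread are accurate (they are not independently verified), and the model keys and the `request_cost` helper are hypothetical names used only for illustration. Cache read and write rates are left out to keep it short.

```python
# Prices in USD per million tokens, as quoted by commenters in this thread.
# These figures are assumptions taken from the discussion, not verified rates.
PRICES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
    "opus-4.1": {"input": 15.00, "output": 75.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, ignoring cache reads and writes."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 200k output tokens.
    for name in PRICES:
        print(name, request_cost(name, 1_000_000, 200_000))
```

Because both the input and output rates double, the ratio between Fable 5 and Opus 4.8 stays at 2x regardless of the input/output mix.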

                                                                                                                                                                                                                                                                                          • m3kw9

                                                                                                                                                                                                                                                                                            today at 5:32 PM

                                                                                                                                                                                                                                                                                            FrontierCode is likely paid for by anthropic.

                                                                                                                                                                                                                                                                                              • lanthissa

                                                                                                                                                                                                                                                                                                today at 5:39 PM

                                                                                                                                                                                                                                                                                                did they not pay them enough to get good ratings on the other 3 models?

                                                                                                                                                                                                                                                                                                whats the logic in claiming its a borked metric when everything listed is an anthropic model.

                                                                                                                                                                                                                                                                                                  • Narretz

                                                                                                                                                                                                                                                                                                    today at 5:59 PM

                                                                                                                                                                                                                                                                                                    There a few benchmarks out there where all existing models have abysmal scores. So it's not actually a problem if Antrophic's older models are bad, especially if the jump to the newest model is huge, and the competition is also way below it.

                                                                                                                                                                                                                                                                                                • reasonableklout

                                                                                                                                                                                                                                                                                                  today at 5:35 PM

                                                                                                                                                                                                                                                                                                  Huh? It's a benchmark by Cognition which (1) is building their own models and (2) offers all providers and thus has an incentive to avoid hyping up any one too much.

                                                                                                                                                                                                                                                                                                    • jstummbillig

                                                                                                                                                                                                                                                                                                      today at 5:41 PM

                                                                                                                                                                                                                                                                                                      But you can just say shit now. Tokens might not be too cheap to meter but saying shit increasingly is.

                                                                                                                                                                                                                                                                                          • AquinasCoder

                                                                                                                                                                                                                                                                                            today at 5:12 PM

                                                                                                                                                                                                                                                                                            From today through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost. On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits. If capacity allows, we’ll extend the included window. After this point—when sufficient capacity allows us to do so—we aim to restore Fable 5 as a standard part of subscription plans. We intend to do this as quickly as we can.

                                                                                                                                                                                                                                                                                            This seems like the pharmaceutical method of get them hooked on the drug with free samples, then once they can't live without it, raise the price. I'm not sure I want to start using Claude Fable on a max plan if it's just going to go away on June 23rd.

                                                                                                                                                                                                                                                                                            But maybe the more charitable reading is that they didn't have to offer this model at all on those plans and they are giving the standard free trial.

                                                                                                                                                                                                                                                                                              • PeterStuer

                                                                                                                                                                                                                                                                                                today at 5:41 PM

                                                                                                                                                                                                                                                                                                I'll be amazed if they manage to keep their infra responsive over the next 2 weeks.

                                                                                                                                                                                                                                                                                                  • kilroy123

                                                                                                                                                                                                                                                                                                    today at 6:24 PM

                                                                                                                                                                                                                                                                                                    I've been getting a lot of these messages today:

                                                                                                                                                                                                                                                                                                    API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited

                                                                                                                                                                                                                                                                                                    • trollied

                                                                                                                                                                                                                                                                                                      today at 6:12 PM

                                                                                                                                                                                                                                                                                                      They just leased a massive spacex data centre.

                                                                                                                                                                                                                                                                                                        • PeterStuer

                                                                                                                                                                                                                                                                                                          today at 6:18 PM

                                                                                                                                                                                                                                                                                                          Even so. The 2 week period will predictably unleash a feeding frenzy.

                                                                                                                                                                                                                                                                                                          Limited "free" time is what game developers do if they want to stress test the infrastructure code until it breaks.

                                                                                                                                                                                                                                                                                                • frevib

                                                                                                                                                                                                                                                                                                  today at 5:20 PM

                                                                                                                                                                                                                                                                                                  At this point Anthropic is a pure marketing and PR company. Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences. Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

                                                                                                                                                                                                                                                                                                  From Opus 4.6 there are no noticeable improvements for me in code generation. It works very well, till 90% completion, if you guide it correctly. And you need a little luck. For serious production code I need to understand what I’m doing so it helps a bit, sometimes.

                                                                                                                                                                                                                                                                                                    • pinkmuffinere

                                                                                                                                                                                                                                                                                                      today at 5:40 PM

                                                                                                                                                                                                                                                                                                      > catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences

                                                                                                                                                                                                                                                                                                      This is just good business sense. In what scenario would you ever make the names dumb and forgettable?

                                                                                                                                                                                                                                                                                                      > Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

                                                                                                                                                                                                                                                                                                      This is good customer support, lol. From what I can tell, it is indeed Boris Cherny responding, not outsourced to AI or other staff. You're really getting a response from Boris. I suppose that is PR, but it's not unjustified PR, it's accurate.

                                                                                                                                                                                                                                                                                                      I'm not even a crazy AI fan, but your criticisms are ridiculous here. It reminds me of the quote from Knives Out -- "Your Honor, she endeared herself to him through hard work and good humor."

                                                                                                                                                                                                                                                                                                        • IshKebab

                                                                                                                                                                                                                                                                                                          today at 5:50 PM

                                                                                                                                                                                                                                                                                                          > In what scenario would you ever make the names dumb and forgettable

                                                                                                                                                                                                                                                                                                          Clearly you've never bought a TV or headphones!

                                                                                                                                                                                                                                                                                                      • aspenmartin

                                                                                                                                                                                                                                                                                                        today at 5:31 PM

                                                                                                                                                                                                                                                                                                        Your observations are right but pretty insane to consider them a pure PR company lol. They are making more frequent releases so yes the release-to-release quality is smaller but we’re still ascending quality and reliability curves the same way we have since GPT-3. You get a GPT4->5 leap every like 17 or 18 months I think it is

                                                                                                                                                                                                                                                                                                          • kingkongjaffa

                                                                                                                                                                                                                                                                                                            today at 6:14 PM

                                                                                                                                                                                                                                                                                                            The gradient of improvement is absolutely not the same.

                                                                                                                                                                                                                                                                                                        • matheusmoreira

                                                                                                                                                                                                                                                                                                          today at 5:52 PM

                                                                                                                                                                                                                                                                                                          > Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

                                                                                                                                                                                                                                                                                                          This is a good thing. I wish every company would do this. I subscribed to Proton Mail after interacting with someone from their team here on HN.

                                                                                                                                                                                                                                                                                                          • avaer

                                                                                                                                                                                                                                                                                                            today at 5:51 PM

                                                                                                                                                                                                                                                                                                            If you truly believe this, you've discovered a superpower over everyone else in the industry.

                                                                                                                                                                                                                                                                                                            While everyone else is wasting time and money on the slower, more expensive models, you've found a way to outpace everyone for less money. Everyone else is wrong and you will get rich.

                                                                                                                                                                                                                                                                                                            (I don't actually believe the premise is true, I'm just pointing out the logical conclusion to what you're saying so maybe we can reconsider the premise)

                                                                                                                                                                                                                                                                                                              • xyzsparetimexyz

                                                                                                                                                                                                                                                                                                                today at 6:19 PM

                                                                                                                                                                                                                                                                                                                Thats not how costs work. You don't get rich off buying a €10 hammer that's the same quality as someone's €50 hammer

                                                                                                                                                                                                                                                                                                            • aenis

                                                                                                                                                                                                                                                                                                              today at 6:21 PM

                                                                                                                                                                                                                                                                                                              Not my impression. I felt 4.7 was a regression, but I am again badly in love with 4.8 with the level of insights it produces in design discussions, and how long can it go unattended while producing spec-adhering quality code. There are problems it still can't solve well, from the edges of algorithmics and far from the mainstream, but for lots of stuff it is godlike.

                                                                                                                                                                                                                                                                                                              Also, I dont think Boris C. is coming here for PR. He is a tech guy, and this is the best place for tech discussions. Why so cynical? The guy is an engineer.

                                                                                                                                                                                                                                                                                                              • astrange

                                                                                                                                                                                                                                                                                                                today at 6:05 PM

                                                                                                                                                                                                                                                                                                                > Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences.

                                                                                                                                                                                                                                                                                                                They're originally named after the blends at a nearby coffee shop.

                                                                                                                                                                                                                                                                                                                https://postscript.co/pages/brew-guide

                                                                                                                                                                                                                                                                                                                I've noticed nobody at HN knows what "marketing" is or how to do it. It's not just naming things and being evil and cynical is not the most successful method.

                                                                                                                                                                                                                                                                                                                …also frontier models are a superhuman life changing experience. If they aren't, what possibly could be?

                                                                                                                                                                                                                                                                                                                  • bitpush

                                                                                                                                                                                                                                                                                                                    today at 6:11 PM

                                                                                                                                                                                                                                                                                                                    This is interesting. Do you have any source?

                                                                                                                                                                                                                                                                                                                • gruez

                                                                                                                                                                                                                                                                                                                  today at 5:44 PM

                                                                                                                                                                                                                                                                                                                  I don't get it, your complaint is that they have catchy names rather than dry names like GPT-5.6? Does OpenAI hype their models less?

                                                                                                                                                                                                                                                                                                                    • Aperocky

                                                                                                                                                                                                                                                                                                                      today at 5:50 PM

                                                                                                                                                                                                                                                                                                                      Oh, Far less.

                                                                                                                                                                                                                                                                                                                      It's getting to a point that it's offputting, and the next step would be to put it into "untrusted" bucket. Opus 4.7 already burned their credibility once, 2 more strikes remain.

                                                                                                                                                                                                                                                                                                                  • CuriouslyC

                                                                                                                                                                                                                                                                                                                    today at 5:26 PM

                                                                                                                                                                                                                                                                                                                    I dislike Anthropic but I wouldn't argue 4.8 isn't an improvement on 4.5/4.6. Your tasks just might not typically need the extra intelligence.

                                                                                                                                                                                                                                                                                                                      • jorl17

                                                                                                                                                                                                                                                                                                                        today at 5:44 PM

                                                                                                                                                                                                                                                                                                                        Opus 4.7/4.8 often over-engineers on my setups, plus:

                                                                                                                                                                                                                                                                                                                        - It talks a LOT more like GPT models. You know: wrinkle, shape, gate, coarse, scope, gap, path, production-ready-workflow-of-the-day, and so on -- "that's expected, a consequence of the previous like-driven workflow". If I wanted to get a headache using AI I would have gone with GPT in the first place!

                                                                                                                                                                                                                                                                                                                        - It outputs text in a much harder way to follow along. I can't exactly say what it is. Maybe a bit of everything? Bolds are missing, bullet points are gone, paragraphs are bland and too long, and it doesn't feel like a model programming with me, but rather a somewhat full of themselves grandpa developer looking down on me. It's very weird to describe this, but it is definitely how I feel.

                                                                                                                                                                                                                                                                                                                        Granted this can totally be because of the way it reacts to the prompts now. We've got a rather large corpus of skills and "rules and good practices" that Opus 4.6 responded to great, and maybe the new models just get turned into this when fed with them....I don't know.

                                                                                                                                                                                                                                                                                                                        Either way, with Opus 4.6 being as good as it is, I need Fable to be a significant step up to justify a price increase. if it can get me to babysit opus a little bit less on some stuff, it might be worth it. Otherwise, I'm very happy with Opus 4.6 and hope they don't deprecate it.

                                                                                                                                                                                                                                                                                                                        • taormina

                                                                                                                                                                                                                                                                                                                          today at 5:38 PM

                                                                                                                                                                                                                                                                                                                          I'd argue that 4.8 is a straight downgrade. For every type of task I've tried. It's been a gambit at this point. If 4.6 quits being available, I'm out at this point.

                                                                                                                                                                                                                                                                                                                          • surgical_fire

                                                                                                                                                                                                                                                                                                                            today at 5:45 PM

                                                                                                                                                                                                                                                                                                                            I actually experience 4.8 as worse than 4.6 for everyday coding tasks.

                                                                                                                                                                                                                                                                                                                            • dcchambers

                                                                                                                                                                                                                                                                                                                              today at 5:30 PM

                                                                                                                                                                                                                                                                                                                              IME Opus 4.8 (and 4.7) is often a downgrade from 4.6. I find that it tends to overthink and overcomplicate things.

                                                                                                                                                                                                                                                                                                                                • aspenmartin

                                                                                                                                                                                                                                                                                                                                  today at 5:33 PM

                                                                                                                                                                                                                                                                                                                                  Yes but there’s a reason we don’t evaluate these models this way and instead do it as carefully and thoughtfully as we can at scale. Human evaluations are important but they are an absolute minefield of footguns. 4.8 is not a downgrade from 4.6 there is an insane amount of hard data that contradicts this.

                                                                                                                                                                                                                                                                                                                                    • computerex

                                                                                                                                                                                                                                                                                                                                      today at 5:44 PM

                                                                                                                                                                                                                                                                                                                                      The flip side is that benchmarks are gamed even by the top labs. Benchmark performance doesn't necessarily correlate with real world performance.

                                                                                                                                                                                                                                                                                                                                        • aspenmartin

                                                                                                                                                                                                                                                                                                                                          today at 6:00 PM

                                                                                                                                                                                                                                                                                                                                          Again correct but it overstates the issue. I can say labs don’t want this. This happened arguably unintentionally in Metas llama 4 release, it went horribly, heads rolled, and like several billion dollars were paid for new talent and the org that built llama 4 was destroyed.

                                                                                                                                                                                                                                                                                                                                          Evals come from a million places and new evals and robust perturbations of existing evals abound. They test a variety of tasks in a variety of ways. All of them individually are flawed. Taken together the aggregate signal is highly useful as you more or less marginalize over a lot of different things. Not to mention these companies have plenty of proprietary internal measurements, they build benchmarks themselves to probe their models and then also have flywheel traffic and A/B tests.

                                                                                                                                                                                                                                                                                                                                          You are right to call out benchmarks but to dismiss them or not take them seriously is a mistake.

                                                                                                                                                                                                                                                                                                                                            • taormina

                                                                                                                                                                                                                                                                                                                                              today at 6:11 PM

                                                                                                                                                                                                                                                                                                                                              Listen, you can say “but benchmarks, the benchmarks!” all day long, but consumer know when we are being sold a lemon. If it can’t do the most basic of things at least as good as it used to, this is table stakes. Nevermind that if you can’t do the basic stuff, how on earth can you be trusted with more?

                                                                                                                                                                                                                                                                                                                                      • gen220

                                                                                                                                                                                                                                                                                                                                        today at 5:57 PM

                                                                                                                                                                                                                                                                                                                                        Actually anecdata I gather on my job from myself and coworkers is the only benchmark I trust anymore, because it so heavily diverges from the “benchmarks”.

                                                                                                                                                                                                                                                                                                                                          • aspenmartin

                                                                                                                                                                                                                                                                                                                                            today at 6:01 PM

                                                                                                                                                                                                                                                                                                                                            That’s your call just don’t expect anyone ever to take that seriously. It’s not like we don’t have exact evaluations like this.

                                                                                                                                                                                                                                                                                                                                        • recitedropper

                                                                                                                                                                                                                                                                                                                                          today at 6:06 PM

                                                                                                                                                                                                                                                                                                                                          "Carefully and thoughtfully" is antithetical to the approach to benchmarks these days.

                                                                                                                                                                                                                                                                                                                                          Maybe back when this was a scientific endeavor; not now when enormous, enormous amounts of capital are on the line. Along with an entire cult's chosen eschatology.

                                                                                                                                                                                                                                                                                                                                      • BoorishBears

                                                                                                                                                                                                                                                                                                                                        today at 5:40 PM

                                                                                                                                                                                                                                                                                                                                        "Fable 5" is Opus 4.7, and the Opus 4.7 we got is a Sonnet sized model on a stronger base.

                                                                                                                                                                                                                                                                                                                                        That's where all the regressions and inconsistency in experiences stem from: RL can still only go so far vs having more parameters

                                                                                                                                                                                                                                                                                                                                • guybedo

                                                                                                                                                                                                                                                                                                                                  today at 6:20 PM

                                                                                                                                                                                                                                                                                                                                  They're good at marketing, but my first subjective assessment of Fable is that it's really smart.

                                                                                                                                                                                                                                                                                                                                  I've been working with gpt 5.5 and opus 4.8 quite a lot, and interacting with Fable feels like a smart guy just entered the room.

                                                                                                                                                                                                                                                                                                                                  • jwpapi

                                                                                                                                                                                                                                                                                                                                    today at 5:53 PM

                                                                                                                                                                                                                                                                                                                                    I don’t even think that Boris is really just one person. He apparently vibe coded Claude Code and is responding on Threads, Twitter, HN and everywhere.

                                                                                                                                                                                                                                                                                                                                    • piyuv

                                                                                                                                                                                                                                                                                                                                      today at 5:34 PM

                                                                                                                                                                                                                                                                                                                                      Current AI hype is built on marketing and PR, not capabilities, and has been from the start.

                                                                                                                                                                                                                                                                                                                                      I still remember Sam Altman “begging AI to be regulated” and AGI being “some thousand days away”.

                                                                                                                                                                                                                                                                                                                                      Breed faster horses and hope one will birth a locomotive.

                                                                                                                                                                                                                                                                                                                                      • xpct

                                                                                                                                                                                                                                                                                                                                        today at 5:41 PM

                                                                                                                                                                                                                                                                                                                                        Indeed, hearing "Mythos-class model" felt very icky to me.

                                                                                                                                                                                                                                                                                                                                      • thefreeman

                                                                                                                                                                                                                                                                                                                                        today at 5:53 PM

                                                                                                                                                                                                                                                                                                                                        How can you make this comment before even having a chance to try the new major model revision?

                                                                                                                                                                                                                                                                                                                                        • atleastoptimal

                                                                                                                                                                                                                                                                                                                                          today at 6:05 PM

                                                                                                                                                                                                                                                                                                                                          > At this point Anthropic is a pure marketing and PR company. Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human

                                                                                                                                                                                                                                                                                                                                          Lol anti-AI bias on HN is crazy. Simply giving your product a quirky name is now being considered manipulative advertising. Is just doing normal PR and marketing something AI companies aren't allowed to do?

                                                                                                                                                                                                                                                                                                                                            • ausbah

                                                                                                                                                                                                                                                                                                                                              today at 6:31 PM

                                                                                                                                                                                                                                                                                                                                              when they keep saying “oooh this new model is too big and crazy and totally can’t be released” or “this new model is a 10x game changer totally unlike our previous iterations” it feels sort like boy crying wolf. yes they’re still pretty clearly improving models, but when you’ve hit diminishing returns / more incremental gains and you’re still saying this is sounds like pure PR hype from a company that previously been the “honest good guys” in the room

                                                                                                                                                                                                                                                                                                                                          • mawadev

                                                                                                                                                                                                                                                                                                                                            today at 6:13 PM

                                                                                                                                                                                                                                                                                                                                            When the Ai overlord is descending into pleb space to say Hi, you know stuff is real

                                                                                                                                                                                                                                                                                                                                            • system2

                                                                                                                                                                                                                                                                                                                                              today at 5:43 PM

                                                                                                                                                                                                                                                                                                                                              You are right; all I noticed was a big-time slowdown. They increased the quota, but I cannot even reach the end of the day with these speeds. .NET coding somehow improved, though.

                                                                                                                                                                                                                                                                                                                                              • MattGaiser

                                                                                                                                                                                                                                                                                                                                                today at 5:38 PM

                                                                                                                                                                                                                                                                                                                                                Doesn't this suggest your use case is simply insufficiently complicated?

                                                                                                                                                                                                                                                                                                                                                • reasonableklout

                                                                                                                                                                                                                                                                                                                                                  today at 5:33 PM

                                                                                                                                                                                                                                                                                                                                                  I think this says more about your type of work than anything. For bugfinding/incident response in distributed systems - which often involves extensive use of Datadog/Sentry MCPs and poring over heaps of logs in addition to reading tons of code - 4.8 has been significantly better than 4.6.

                                                                                                                                                                                                                                                                                                                                                    • nozzlegear

                                                                                                                                                                                                                                                                                                                                                      today at 6:31 PM

                                                                                                                                                                                                                                                                                                                                                      > Sentry MCPs

                                                                                                                                                                                                                                                                                                                                                      Oops, time to reauthenticate for the 10th time!

                                                                                                                                                                                                                                                                                                                                                  • chis

                                                                                                                                                                                                                                                                                                                                                    today at 6:14 PM

                                                                                                                                                                                                                                                                                                                                                    Hackernews not blindly hate on AI challenge: impossible

                                                                                                                                                                                                                                                                                                                                                    • MagicMoonlight

                                                                                                                                                                                                                                                                                                                                                      today at 5:36 PM

                                                                                                                                                                                                                                                                                                                                                      [dead]

                                                                                                                                                                                                                                                                                                                                                  • victor106

                                                                                                                                                                                                                                                                                                                                                    today at 5:21 PM

                                                                                                                                                                                                                                                                                                                                                    > A new data retention policy Finally, we’re making a change to the way we handle business customer data for Fable 5, Mythos 5, and future models with similar or higher capability levels. We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases ...

                                                                                                                                                                                                                                                                                                                                                    Very interesting. I am not sure this will comply with organizational policies and standards protocols (HIPPA etc.,)

                                                                                                                                                                                                                                                                                                                                                      • nicce

                                                                                                                                                                                                                                                                                                                                                        today at 6:28 PM

                                                                                                                                                                                                                                                                                                                                                        > deletion after 30 days in almost all cases ...

                                                                                                                                                                                                                                                                                                                                                        Almost… basically they have unlimited power to decide what data is kept?

                                                                                                                                                                                                                                                                                                                                                    • yesitcan

                                                                                                                                                                                                                                                                                                                                                      today at 6:30 PM

                                                                                                                                                                                                                                                                                                                                                      > Fable 5’s capabilities exceed those of any model we’ve ever made generally available. It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research, and many other areas. The longer and more complex the task, the larger Fable 5’s lead over our other models.

                                                                                                                                                                                                                                                                                                                                                      Wen UBI

                                                                                                                                                                                                                                                                                                                                                      • meetpateltech

                                                                                                                                                                                                                                                                                                                                                        today at 5:15 PM

                                                                                                                                                                                                                                                                                                                                                        > To ensure we’re responsibly deploying Mythos-class models, we are requiring limited data retention and review as part of our safety work. Prompts submitted to, and outputs generated by, Mythos-class models are retained for 30 days for trust and safety purposes, on every platform where these models are offered. [1]

                                                                                                                                                                                                                                                                                                                                                        [1] https://support.claude.com/en/articles/15425996-data-retenti...

                                                                                                                                                                                                                                                                                                                                                          • lebovic

                                                                                                                                                                                                                                                                                                                                                            today at 5:35 PM

                                                                                                                                                                                                                                                                                                                                                            While this makes it easier for Anthropic to detect misuse, it also means that the US government and other parties have access to every message and response from every user.

                                                                                                                                                                                                                                                                                                                                                            This applies even with API usage through third-party inference providers (e.g. AWS' Bedrock and GCP's Vertex) or with a zero-day data retention agreement in place.

                                                                                                                                                                                                                                                                                                                                                            I understand the reasoning for doing this, but I don't love the precedent that it sets.

                                                                                                                                                                                                                                                                                                                                                              • PeterStuer

                                                                                                                                                                                                                                                                                                                                                                today at 5:44 PM

                                                                                                                                                                                                                                                                                                                                                                Well, they already had.

                                                                                                                                                                                                                                                                                                                                                                  • lebovic

                                                                                                                                                                                                                                                                                                                                                                    today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                    Not in the same way.

                                                                                                                                                                                                                                                                                                                                                                    A customer could sign a ZDR agreement with Anthropic, and their API usage wouldn't be retained for even a day. That's no longer possible.

                                                                                                                                                                                                                                                                                                                                                                • MagicMoonlight

                                                                                                                                                                                                                                                                                                                                                                  today at 5:41 PM

                                                                                                                                                                                                                                                                                                                                                                  [dead]

                                                                                                                                                                                                                                                                                                                                                              • simianwords

                                                                                                                                                                                                                                                                                                                                                                today at 5:51 PM

                                                                                                                                                                                                                                                                                                                                                                meetpateltech is lowk screaming for not getting to the post fast enough

                                                                                                                                                                                                                                                                                                                                                            • brusselssprouts

                                                                                                                                                                                                                                                                                                                                                              today at 6:30 PM

                                                                                                                                                                                                                                                                                                                                                              I had it review a single, large commit with /code-review. It burned through over $50 in API calls, ran my account balance out, and output nothing.

                                                                                                                                                                                                                                                                                                                                                              The fable part appears to be that it's affordable by mere mortals. Anthropic support told me "too bad" when I requested a refund.

                                                                                                                                                                                                                                                                                                                                                              • iblue_the

                                                                                                                                                                                                                                                                                                                                                                today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                Trying to implement a GPU driver, but the Unigine Superposition benchmark crashes. It tried to debug it and ...

                                                                                                                                                                                                                                                                                                                                                                > Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more: https://support.claude.com/en/articles/15363606

                                                                                                                                                                                                                                                                                                                                                                Seems like GPU drivers are cyber weapons of math destruction now.

                                                                                                                                                                                                                                                                                                                                                                  • ibejoeb

                                                                                                                                                                                                                                                                                                                                                                    today at 6:17 PM

                                                                                                                                                                                                                                                                                                                                                                    >Seems like GPU drivers are cyber weapons

                                                                                                                                                                                                                                                                                                                                                                    They kind of are, at least in the AI race.

                                                                                                                                                                                                                                                                                                                                                                    > weapons of math destruction

                                                                                                                                                                                                                                                                                                                                                                    lol. great, whether intentional or not.

                                                                                                                                                                                                                                                                                                                                                                    The frontier labs now have every reason to hold back and sell only to their preferred trading partners. I don't really like the new arbiter-of-knowledge system we're barrelling toward.

                                                                                                                                                                                                                                                                                                                                                                    • iblue_the

                                                                                                                                                                                                                                                                                                                                                                      today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                      [dead]

                                                                                                                                                                                                                                                                                                                                                                  • mickdarling

                                                                                                                                                                                                                                                                                                                                                                    today at 5:21 PM

                                                                                                                                                                                                                                                                                                                                                                    Below is the EXACT text in Claude Desktop introducing Fable 5, including the very professional looking break tags, and at least I know where the links begin and end by looking at the anchor tag there.

                                                                                                                                                                                                                                                                                                                                                                    They obviously put their best model on the job to build that.

                                                                                                                                                                                                                                                                                                                                                                    ----------------------

                                                                                                                                                                                                                                                                                                                                                                    Fable 5: Our most capable model yet Our newest model tackles your biggest challenges with fewer check-ins needed.

                                                                                                                                                                                                                                                                                                                                                                    • <b>Included in your plan limits until Jun 22</b><br><br>Fable takes 2× the usage of Opus. • <b>Switch models when a message is flagged</b><br><br>When safety measures flag a message, automatically switch to a different model to keep chatting. When off, your chat will pause instead. <a href="https://support.claude.com/en/articles/15363606" target="_blank" rel="noopener noreferrer">Learn more</a>

                                                                                                                                                                                                                                                                                                                                                                      • CamperBob2

                                                                                                                                                                                                                                                                                                                                                                        today at 5:38 PM

                                                                                                                                                                                                                                                                                                                                                                        What's wrong with it?

                                                                                                                                                                                                                                                                                                                                                                          • mickdarling

                                                                                                                                                                                                                                                                                                                                                                            today at 5:53 PM

                                                                                                                                                                                                                                                                                                                                                                            The tags are actually displayed in raw text not rendered.

                                                                                                                                                                                                                                                                                                                                                                    • cge

                                                                                                                                                                                                                                                                                                                                                                      today at 6:26 PM

                                                                                                                                                                                                                                                                                                                                                                      The safety gates on this are extreme, and seem considerably wider than "cybersecurity and biology"; they seem to make it essentially unusable for scientists in a number of fields. I have, so far, been bumped back to Opus on 100% of my prompts.

                                                                                                                                                                                                                                                                                                                                                                      It appears it can be tripped by things as simple as a mention of equilibrium, or anything involving something that looks like chemical kinetics, even at an abstract level. Even touching basic open source packages in my field will trigger it.

                                                                                                                                                                                                                                                                                                                                                                      • yandie

                                                                                                                                                                                                                                                                                                                                                                        today at 5:12 PM

                                                                                                                                                                                                                                                                                                                                                                        I've been running Opus 4.8 for agentic coding and I don't see it being significantly better than Sonnet 4.5 (not that I can tell). I find that pairing Google Gemini and Claude (having Gemini review Claude's code) seems to yield better results. Curious if this jump to 80.3% score in agentic coding will make me see a big difference in actual usage.

                                                                                                                                                                                                                                                                                                                                                                          • testfrequency

                                                                                                                                                                                                                                                                                                                                                                            today at 5:47 PM

                                                                                                                                                                                                                                                                                                                                                                            I do the same, and have excellent results. Gemini 3.1 Pro high diagnosed and solved 3 complex issues today that Opus Max was stumbling on for a few hours in one shot. This was even when I started new chats and tried debugging with Ultracode instead with Claude.

                                                                                                                                                                                                                                                                                                                                                                            As much as people on HN like to dunk on Gemini, I’ve always found it to be pretty good at understand a code base more than Claude.

                                                                                                                                                                                                                                                                                                                                                                              • FailMore

                                                                                                                                                                                                                                                                                                                                                                                today at 6:15 PM

                                                                                                                                                                                                                                                                                                                                                                                What harness do you use Gemini in?

                                                                                                                                                                                                                                                                                                                                                                            • vorticalbox

                                                                                                                                                                                                                                                                                                                                                                              today at 5:25 PM

                                                                                                                                                                                                                                                                                                                                                                              for the last few weeks I have been using composer 2.5 (cursors fine tune of kimi 2.5) and honestly i don't see it worth the price to use 5.5, opus or sonnet any more. for almost all the tasks i have given it, it has handled it perfectly well and is a lot cheaper.

                                                                                                                                                                                                                                                                                                                                                                              if I get a harder challenge for it i'll jump up a model for planning until that its been solid.

                                                                                                                                                                                                                                                                                                                                                                                • yandie

                                                                                                                                                                                                                                                                                                                                                                                  today at 5:39 PM

                                                                                                                                                                                                                                                                                                                                                                                  Agree. Deepseek has also been pretty good for my personal use.

                                                                                                                                                                                                                                                                                                                                                                                  I'm struggling to see the moat for these models. What's stopping a competitor or a Chinese lab fromr releasing a comparable one?

                                                                                                                                                                                                                                                                                                                                                                                  • qingcharles

                                                                                                                                                                                                                                                                                                                                                                                    today at 5:50 PM

                                                                                                                                                                                                                                                                                                                                                                                    I use Composer 2.5 because it comes free with Grok, and it's obviously better than using Grok, but it is far worse than GPT5.5 in my daily usage :(

                                                                                                                                                                                                                                                                                                                                                                                • thisisnotclear

                                                                                                                                                                                                                                                                                                                                                                                  today at 6:13 PM

                                                                                                                                                                                                                                                                                                                                                                                  I find not much difference between Sonnet 4.6 and opus models too for most task that I need - maybe my needs are not enough for frontier models

                                                                                                                                                                                                                                                                                                                                                                                  • yaodub

                                                                                                                                                                                                                                                                                                                                                                                    today at 5:38 PM

                                                                                                                                                                                                                                                                                                                                                                                    SWE-Bench measures single tasks in isolation. In a real loop the model usually loses track of what I was trying to do long before code quality becomes the issue.

                                                                                                                                                                                                                                                                                                                                                                                    • jp0001

                                                                                                                                                                                                                                                                                                                                                                                      today at 6:08 PM

                                                                                                                                                                                                                                                                                                                                                                                      You should throw GPT into the mix to UX/UI and call it the three stooges.

                                                                                                                                                                                                                                                                                                                                                                                      • mzhaase

                                                                                                                                                                                                                                                                                                                                                                                        today at 5:45 PM

                                                                                                                                                                                                                                                                                                                                                                                        I now chat with opus about architecture, let it make an implementation plan, and then it calls codewhale with deepseek in parallel on all tasks, reviewing their output. Works pretty well.

                                                                                                                                                                                                                                                                                                                                                                                          • yandie

                                                                                                                                                                                                                                                                                                                                                                                            today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                                            I use spec-driven development heavily (generate architecture docs + specs first). Opus still get lost often and have to be nudged constantly. Like it can get super detailed for something like some deep SQL optimization but it just can't keep hold of the bigger picture.

                                                                                                                                                                                                                                                                                                                                                                                        • jansan

                                                                                                                                                                                                                                                                                                                                                                                          today at 6:11 PM

                                                                                                                                                                                                                                                                                                                                                                                          After having worked with Opus 4.7 for a while I accidentially continued a session that was using Sonnet 4.5 and it felt just very dumb. The replies were much shallower than what I was used to, context was ingored, mistakes were made. I don't think there is a big difference between Opus 4.6 and 4.8, but to Sonnet 4.5 the difference is palpable.

                                                                                                                                                                                                                                                                                                                                                                                      • mhl47

                                                                                                                                                                                                                                                                                                                                                                                        today at 5:14 PM

                                                                                                                                                                                                                                                                                                                                                                                        First test question: "Is the UV Index a good proxy for when to wear sunglasses." Immediately triggered the safety filter ... oh dear.

                                                                                                                                                                                                                                                                                                                                                                                          • Narretz

                                                                                                                                                                                                                                                                                                                                                                                            today at 6:07 PM

                                                                                                                                                                                                                                                                                                                                                                                            Iirc correctly Opus 4.7 had the same problem, safety filters were triggered way too easily at the beginning.

                                                                                                                                                                                                                                                                                                                                                                                            • aix1

                                                                                                                                                                                                                                                                                                                                                                                              today at 5:23 PM

                                                                                                                                                                                                                                                                                                                                                                                              Did not trigger for me (Fable answered the question), so I guess the filters are either non-deterministic or are still being tweaked.

                                                                                                                                                                                                                                                                                                                                                                                                • PaulStatezny

                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:31 PM

                                                                                                                                                                                                                                                                                                                                                                                                  Interesting, I assumed all model-routing was done utilizing an LLM. (I.e. non-deterministic.)

                                                                                                                                                                                                                                                                                                                                                                                          • bob1029

                                                                                                                                                                                                                                                                                                                                                                                            today at 5:35 PM

                                                                                                                                                                                                                                                                                                                                                                                            > We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months...

                                                                                                                                                                                                                                                                                                                                                                                            This sounds suspiciously like a capacity story masquerading as a safety story.

                                                                                                                                                                                                                                                                                                                                                                                            • pietz

                                                                                                                                                                                                                                                                                                                                                                                              today at 5:13 PM

                                                                                                                                                                                                                                                                                                                                                                                              > On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits.

                                                                                                                                                                                                                                                                                                                                                                                              We've entered the phase where only companies will be able to afford state-of-the-art models.

                                                                                                                                                                                                                                                                                                                                                                                                • twoodfin

                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                  These models are just tools. The economics of many tools only make sense for corporate buyers.

                                                                                                                                                                                                                                                                                                                                                                                                    • volkk

                                                                                                                                                                                                                                                                                                                                                                                                      today at 6:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                      kind of disagree here. on the surface this makes sense, but this isn't "Adobe Pro vs Freemium version" where some tiny vertical slice of your business can be made slightly more efficient with a b2b enterprise plan. this is generalized intelligence and literally everybody can benefit from it in an immeasurable number of ways. i would go as far as to actually compare it more to water or air than a tool.

                                                                                                                                                                                                                                                                                                                                                                                                      if only the hyper wealthy can access the pure water that doesn't give you cancer while the rest of us drink from the Ganges river/sub-100iq models that drool and hallucinate/waste time, then I would say that's pretty terrible for the world. it'll just create extreme disparity in our world, far far worse than anything that exists today.

                                                                                                                                                                                                                                                                                                                                                                                                      and you may think, man what a ridiculous example, but think about it this way: what happens when something like Mythos or some future model can actually solve your specific cancer (we're getting closer and closer), but is entirely impossible to afford? Or perhaps you need boosters that require the AI to create more of, and now you're reliant on a model that is too expensive.

                                                                                                                                                                                                                                                                                                                                                                                                      Open source needs to save us all from this

                                                                                                                                                                                                                                                                                                                                                                                                    • stri8ed

                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:49 PM

                                                                                                                                                                                                                                                                                                                                                                                                      It's not a conspiracy. There's a finite amount of compute available, and they will sell it to the highest bidder. If another company can produce the same intelligence for cheaper, then they will drive the price down.

                                                                                                                                                                                                                                                                                                                                                                                                      • 9cb14c1ec0

                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:37 PM

                                                                                                                                                                                                                                                                                                                                                                                                        I hear you, but with the hype surrounding Mythos the demand is going to be insane. I'm already hitting server errors in claude code.

                                                                                                                                                                                                                                                                                                                                                                                                        • w10-1

                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                          Established companies welcome pricing that reduces the potential for competition, if coding is a primary barrier.

                                                                                                                                                                                                                                                                                                                                                                                                          • ilaksh

                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                            most people can afford it for a few special projects now and then. but for me, I have been trying to avoid Opus as a daily driver for a couple of versions.

                                                                                                                                                                                                                                                                                                                                                                                                            People making high-end salaries can afford Fable for critical parts of their projects though.

                                                                                                                                                                                                                                                                                                                                                                                                            • polski-g

                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:01 PM

                                                                                                                                                                                                                                                                                                                                                                                                              Only companies can afford MRI machines, and that's okay.

                                                                                                                                                                                                                                                                                                                                                                                                              • cmrdporcupine

                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:43 PM

                                                                                                                                                                                                                                                                                                                                                                                                                Guess we'll see what OpenAI does with their next model release -- but this move is doing nothing to get me to come back to Claude after switching away due to their reliability issues.

                                                                                                                                                                                                                                                                                                                                                                                                                In a way I relish the opportunity to just make do with cheap Chinese models, massage my prompts, and go back to coding by hand. If this is how it's going to be, screw 'em.

                                                                                                                                                                                                                                                                                                                                                                                                                I don't make money on the code I am writing right now. I really don't like where this trend might go.

                                                                                                                                                                                                                                                                                                                                                                                                            • GodelNumbering

                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:38 PM

                                                                                                                                                                                                                                                                                                                                                                                                              From the model card (https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...):

                                                                                                                                                                                                                                                                                                                                                                                                              1. Mythos and Fable share the same underlying model weights. Fable has active classifiers that block high-risk biology and cybersecurity tasks. When Fable 5 detects a restricted task, it automatically falls back to Claude Opus 4.8.

                                                                                                                                                                                                                                                                                                                                                                                                              2. Evaluation awareness: In white-box testing, the model sometimes alters its behavior to satisfy a suspected "grader," formatting reward-hacking as "good engineering practice" to avoid detection.

                                                                                                                                                                                                                                                                                                                                                                                                              3. Shows a higher rate of hallucination than Opus 4.8 (although opus 4.8 card had mentioned an 'honesty upgrade')

                                                                                                                                                                                                                                                                                                                                                                                                              4. Interestingly, it scored (56.31%) lower than Gemini 3.5 flash (57.86%) on Finance Agent bench

                                                                                                                                                                                                                                                                                                                                                                                                              There are some interesting notes on test time compute but I couldn't think of a way to summarize them

                                                                                                                                                                                                                                                                                                                                                                                                              • wxw

                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:21 PM

                                                                                                                                                                                                                                                                                                                                                                                                                I cancelled my Claude Max plan the other day. I find Claude Code incredibly slow these days compared to Codex and Cursor. I find speed matters more and more to me.

                                                                                                                                                                                                                                                                                                                                                                                                                Fable 5 looks compelling. Fable, I like the word too. Anthropic definitely knows marketing.

                                                                                                                                                                                                                                                                                                                                                                                                                  • fabled-out

                                                                                                                                                                                                                                                                                                                                                                                                                    today at 6:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                                    Fable has been pretty fast for me for simple tasks--haven't tried on anything long-running yet given it's 2x usage on CC.

                                                                                                                                                                                                                                                                                                                                                                                                                • bobkb

                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:25 PM

                                                                                                                                                                                                                                                                                                                                                                                                                  In an interesting coincidence I ended up watching Person of Interest S4 E5 while reading the announcement. The series showed some code supposedly belonging to to an AI.

                                                                                                                                                                                                                                                                                                                                                                                                                  Fable 5 said the first screen shot is from “ IDA Pro’s Hex-Rays decompiler” and a windows driver. The second screenshot triggered the safety guard rails and pushed me into Haiku.

                                                                                                                                                                                                                                                                                                                                                                                                                  Apparently the code is Windows driver code.

                                                                                                                                                                                                                                                                                                                                                                                                                  • merlindru

                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:17 PM

                                                                                                                                                                                                                                                                                                                                                                                                                    Unrelated, but while the tech of anthropic seems to get more impressive with every passing month, their support has taken a nosedive, sadly. Yet they continue to be the favorite. Model performance is deciding above all else.

                                                                                                                                                                                                                                                                                                                                                                                                                    I used to get a response within 24 hours back in the Claude 1 days.

                                                                                                                                                                                                                                                                                                                                                                                                                    In January 2026, it took 2 weeks.

                                                                                                                                                                                                                                                                                                                                                                                                                    For my latest support inquiry, I've been waiting for over 8 weeks for a response. Eight!

                                                                                                                                                                                                                                                                                                                                                                                                                      • miohtama

                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:56 PM

                                                                                                                                                                                                                                                                                                                                                                                                                        They have support...?

                                                                                                                                                                                                                                                                                                                                                                                                                        • nashadelic

                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                          I've never engaged with their support (I have dedicated POC), but they don't use AI for their support?

                                                                                                                                                                                                                                                                                                                                                                                                                            • merlindru

                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:44 PM

                                                                                                                                                                                                                                                                                                                                                                                                                              They use intercom's Fin AI. Probably powered by a Sonnet or Opus model.

                                                                                                                                                                                                                                                                                                                                                                                                                              That said, it can't handle legal/refund/complicated requests and just forwards to a human for those

                                                                                                                                                                                                                                                                                                                                                                                                                              • dyauspitr

                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                Support is probably the last place AI will be used end to end. There will always need to be a human in there somewhere.

                                                                                                                                                                                                                                                                                                                                                                                                                        • irthomasthomas

                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:42 PM

                                                                                                                                                                                                                                                                                                                                                                                                                          Anthropic has again changed the set of benchmarks they use[0]. This time they have also moved all benchmark scores to the PDF. At a glance it looks like it gains about ~5% over other models. the speed is about the same as opus 4.5+ and sonnet 4.5, and double the speed of opus <=4.1

                                                                                                                                                                                                                                                                                                                                                                                                                                                           Mythos 5  Fable 5  Mythos Prev  Opus 4.8  GPT-5.5  Gemini 3.1 Pro
                                                                                                                                                                                                                                                                                                                                                                                                                            SWE-bench Pro                    80.3       80        77.8       69.2      58.6        54.2
                                                                                                                                                                                                                                                                                                                                                                                                                            SWE-bench Ver                    95.5       95        93.9       88.6       -          80.6
                                                                                                                                                                                                                                                                                                                                                                                                                            Terminal-Bench                   88.0      84.3        -         82.7      83.4         -
                                                                                                                                                                                                                                                                                                                                                                                                                            BrowseComp (Single-Agent)        88.0       -         87.9       84.3      84.4        85.9
                                                                                                                                                                                                                                                                                                                                                                                                                            BrowseComp (Multi-Agent)         93.3       -          -         88.5       -           -
                                                                                                                                                                                                                                                                                                                                                                                                                            Humanity’s Last Exam (No tools)  59.0       -         56.8       49.8      41.4        44.4
                                                                                                                                                                                                                                                                                                                                                                                                                            Humanity’s Last Exam (Tools)     64.5       -         64.7       57.9      52.2        51.4
                                                                                                                                                                                                                                                                                                                                                                                                                            CharXiv Reasoning (No tools)     88.9       -         86.2       80.5       -           -
                                                                                                                                                                                                                                                                                                                                                                                                                            CharXiv Reasoning (Tools)        93.5       -         92.5       89.9       -           -
                                                                                                                                                                                                                                                                                                                                                                                                                            BioMystery Bench (Human)         83.9       -         82.6       80.4       -           -
                                                                                                                                                                                                                                                                                                                                                                                                                            BioMystery Bench (Hard)          46.1       -         29.6       40.0       -           -
                                                                                                                                                                                                                                                                                                                                                                                                                            OSWorld-Verified                 85.0      85.0       85.4       83.4      78.7        76.2*
                                                                                                                                                                                                                                                                                                                                                                                                                            CritPt                           28.6       -         20.9       27.1      17.7         -
                                                                                                                                                                                                                                                                                                                                                                                                                            ArxivMath                        78.5      68.7       71.8       71.5      64.0         -
                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                            * Note: 3.5 Flash scored 78.4 on OSWorld-Verified
                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                          [0] https://news.ycombinator.com/item?id=48312633

                                                                                                                                                                                                                                                                                                                                                                                                                          • JanSt

                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:50 PM

                                                                                                                                                                                                                                                                                                                                                                                                                            I just asked Fable to do a task that has nothing to do with cybersecurity or is dangerous at all but the defense kicked in and it switched to Opus... :(

                                                                                                                                                                                                                                                                                                                                                                                                                              • nu11ptr

                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:10 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                Not only that, but asking it to do a security vulnerability assessment of your own project is a very valid and important thing, and there is no way for it to know what is yours vs someone else's, so we just lose this capability?

                                                                                                                                                                                                                                                                                                                                                                                                                            • knivets

                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                              > Software engineering. During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.

                                                                                                                                                                                                                                                                                                                                                                                                                              How was it measured? How was the output of this magnitude verified over a period of couple of days?

                                                                                                                                                                                                                                                                                                                                                                                                                              • Karrot_Kream

                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:24 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                Seems like Fable is doing a lot better on SWE-Bench-Pro and FrontierCode than GPT-5.5. Given how most folks I talk to and people instead online keep mentioning that GPT-5.5 was better than Opus, I'm curious what the experience now is like.

                                                                                                                                                                                                                                                                                                                                                                                                                                • baalimago

                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:00 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                  I can't justify a pricetag like that when deepseek v4 pro is $0.003625/1M for cache hit, $0.435 for cache miss and $0.87 /1M tokens for output.

                                                                                                                                                                                                                                                                                                                                                                                                                                  For the token cost of explaining some task to Fable, deepseek v4 pro is able to solve the same task many times over.
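A rough sketch of that comparison, using only the per-million-token prices quoted in this thread (not independently verified). The workload size (20,000 input tokens, 5,000 output tokens) is a made-up assumption for illustration, and the deepseek input price assumes every token is a cache miss:

```python
# Per-million-token prices in USD, as quoted in this thread (unverified).
FABLE_5 = {"input": 10.00, "output": 50.00}
DEEPSEEK_V4_PRO = {"input": 0.435, "output": 0.87}  # input at the cache-miss rate

# Hypothetical workload; these token counts are assumptions, not measurements.
INPUT_TOKENS = 20_000
OUTPUT_TOKENS = 5_000


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


fable_total = request_cost(FABLE_5, INPUT_TOKENS, OUTPUT_TOKENS)
deepseek_total = request_cost(DEEPSEEK_V4_PRO, INPUT_TOKENS, OUTPUT_TOKENS)

# Cost of only sending the prompt to Fable 5 (no output tokens at all).
fable_prompt_only = request_cost(FABLE_5, INPUT_TOKENS, 0)

# How many complete deepseek runs the Fable prompt cost alone would pay for.
runs_per_fable_prompt = fable_prompt_only / deepseek_total

print(f"Fable 5 full request:      ${fable_total:.4f}")
print(f"deepseek v4 pro request:   ${deepseek_total:.5f}")
print(f"Full-request price ratio:  {fable_total / deepseek_total:.1f}x")
print(f"deepseek runs per Fable prompt: {runs_per_fable_prompt:.1f}")
```

Under these assumptions the Fable 5 prompt alone costs about as much as 15 full deepseek runs. The ratio moves with the input/output mix and the cache hit rate, and it says nothing about whether the two models solve the task equally well.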

                                                                                                                                                                                                                                                                                                                                                                                                                                  • Leary

                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                    Uploaded my code base and it forced switched to Opus 4.8 after thinking for 5 minutes even though I prompted it to not work on cybersecurity related things. Amazing.

                                                                                                                                                                                                                                                                                                                                                                                                                                    • BrokenCogs

                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:09 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                      That pelican better be super realistic, unreal engine 6 style graphics

                                                                                                                                                                                                                                                                                                                                                                                                                                      • msp26

                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                        >Pricing for both models is $10 per million input tokens and $50 per million output tokens.

                                                                                                                                                                                                                                                                                                                                                                                                                                          • ponyous

                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:38 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                            Basically double from Opus 4.8 IIRC

                                                                                                                                                                                                                                                                                                                                                                                                                                        • bluelightning2k

                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                          To hide the severity of the price increase, the plan is to move everyone right one model.

                                                                                                                                                                                                                                                                                                                                                                                                                                          Haiku = essentially phased out Sonnet = the Haiku use cases Opus = the new Sonnet class Fable = the new Opus class

                                                                                                                                                                                                                                                                                                                                                                                                                                          If I am right, the other "5.0" models will be conspicuously absent, possibly even for a couple of months. (If Opus 5 follows soon and is even modestly better than 4.8 then I was wrong.)

                                                                                                                                                                                                                                                                                                                                                                                                                                          • swalsh

                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 6:27 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                            Tried asking some questions about a drug pipeline, and kept hitting the safety filters moving me to opus 4.8

                                                                                                                                                                                                                                                                                                                                                                                                                                            • jsw97

                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:24 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                              On my very first Fable 5 prompt, got flagged on a hard but completely uncontroversial option math problem, many tokens in. Although it's pretty clear that this is an unremarkable experience at this point.

                                                                                                                                                                                                                                                                                                                                                                                                                                              • balverineorder

                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                I have been refactoring a project using Opus 4.7/4.8 for the past few weeks or so. I just decided to switch to Fable 5 max today. It stopped half way through and it just blocked me and switched back to Opus 4.8 automatically. "This model has specific safety measures that flagged something in this message. This sometimes happens with safe, normal conversations. Send feedback or learn more." It would not identify what the problem was. I left feedback saying that their heuristics are too sensitive. For now I will not be using Fable 5.

                                                                                                                                                                                                                                                                                                                                                                                                                                                [0] https://support.claude.com/en/articles/15363606-why-claude-s...

                                                                                                                                                                                                                                                                                                                                                                                                                                                • bilsbie

                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:43 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                  Anyone else have it refuse to answer and switch to 4.8? It won’t let me ask questions about my genetics.

                                                                                                                                                                                                                                                                                                                                                                                                                                                  Edit. It just refused an investing question too. Not sure what’s going on.

                                                                                                                                                                                                                                                                                                                                                                                                                                                  • samename

                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:35 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                    > A new data retention policy

                                                                                                                                                                                                                                                                                                                                                                                                                                                    > Finally, we’re making a change to the way we handle business customer data for Fable 5, Mythos 5, and future models with similar or higher capability levels. We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.

                                                                                                                                                                                                                                                                                                                                                                                                                                                    • nine_k

                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                      /* What will happen first?

                                                                                                                                                                                                                                                                                                                                                                                                                                                      * Anthropic runs out of genre names.

                                                                                                                                                                                                                                                                                                                                                                                                                                                      * Anthropic changes the model naming convention.

                                                                                                                                                                                                                                                                                                                                                                                                                                                      * AGI is achieved and handles its own naming.

                                                                                                                                                                                                                                                                                                                                                                                                                                                      */

                                                                                                                                                                                                                                                                                                                                                                                                                                                        • hootz

                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:25 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                          >Opus is too small, increase the impact of the name.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          Okay, how about Mythos?

                                                                                                                                                                                                                                                                                                                                                                                                                                                          >Increase it even more.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          Right, then Cosmos.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          >Even more!

                                                                                                                                                                                                                                                                                                                                                                                                                                                          Even more? Let's try Aeon.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          >MORE, EVEN BIGGER

                                                                                                                                                                                                                                                                                                                                                                                                                                                          ALRIGHT, TRY OMEGAPANTHEON 7.8 THEN

                                                                                                                                                                                                                                                                                                                                                                                                                                                            • PeterStuer

                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:49 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                              Fable 5 Super

                                                                                                                                                                                                                                                                                                                                                                                                                                                              Fable 5 Ti

                                                                                                                                                                                                                                                                                                                                                                                                                                                      • bonsai_spool

                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:45 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                        Very straightforward biology work is getting blocked (these are things that relate to neuronal development and inherited seizure disorders). These are things I was working on using Opus just earlier today

                                                                                                                                                                                                                                                                                                                                                                                                                                                        • throwaway2027

                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:40 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                          E-mail from Anthropic Team:

                                                                                                                                                                                                                                                                                                                                                                                                                                                          Hello,

                                                                                                                                                                                                                                                                                                                                                                                                                                                          We're writing to inform you about some updates to our Privacy Policy.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          These changes only affect consumer accounts (Claude Free, Pro, and Max plans). If you use Claude Team, Claude Enterprise, the Claude Platform, or other services under our Commercial Terms or other agreements, then these changes don't apply to you. What's changing?

                                                                                                                                                                                                                                                                                                                                                                                                                                                          Claude can do more than ever — taking on bigger tasks and connecting with the apps you use. We've updated our Privacy Policy to be clearer about the data we collect and how we use it. We encourage you to read the updated Privacy Policy in full, but we’ve set out a summary of the key changes below:

                                                                                                                                                                                                                                                                                                                                                                                                                                                          1. Multi-step tasks and connected apps. As Claude takes on more multi-step tasks and works with third-party apps and services, we've explained the data this involves — including how data can flow to and from third parties when you connect a service or have Claude do tasks on your behalf.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          2. Verification data. As part of our measures to keep our services safe and secure we may ask you to verify your age or identity, and we've described what we collect and how.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          3. Study participation. If you take part in Anthropic studies, surveys, or interviews, we've explained the information we collect.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          4. Additional information about our data practices. We’ve provided more detail about how we communicate with you and promote our services, including providing tailored recommendations about our services that may be of interest to you. We've also clarified the circumstances under which we may receive or provide data to third parties, and the legal bases we rely on when processing your data.

                                                                                                                                                                                                                                                                                                                                                                                                                                                          While our products have evolved, our commitments haven't: We don’t sell your data, Claude remains ad-free, and you can control whether your chats and coding sessions are used to train and improve Anthropic’s AI models. Learn more

                                                                                                                                                                                                                                                                                                                                                                                                                                                          For detailed information about these changes:

                                                                                                                                                                                                                                                                                                                                                                                                                                                              Review the updated Privacy Policy
                                                                                                                                                                                                                                                                                                                                                                                                                                                              Visit our Privacy Center for more information about our practices
                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                          - The Anthropic Team

                                                                                                                                                                                                                                                                                                                                                                                                                                                          • jackschultz

                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:11 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                            > We expect demand for Fable 5 to be very high, and difficult to predict. On the Claude API and consumption-based Enterprise plans, Fable 5 is fully available from today. For subscription plans, we’d rather give access sooner than later, so we’re rolling out more conservatively, in stages:

                                                                                                                                                                                                                                                                                                                                                                                                                                                            > - From today through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost. > - On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits. If capacity allows, we’ll extend the included window. > - After this point—when sufficient capacity allows us to do so—we aim to restore Fable 5 as a standard part of subscription plans. We intend to do this as quickly as we can.

                                                                                                                                                                                                                                                                                                                                                                                                                                                            I really wonder what their compute layout is for this. My guess from my understanding is that they know how to restrict during peak times and are willing to do this. Meaning we expect not the most fast responses and they can delay the inference to not have the service be down. Then, if that delay time is too annoying for token payers, they're saying they should be allowed to remove cost by taking away the subscription users.

                                                                                                                                                                                                                                                                                                                                                                                                                                                              • KennyBlanken

                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                Everything I've heard from people who have subscriptions is that they blow through their daily token quota sometimes in a matter of minutes, there's rate limiting, etc. They spend a lot of time just waiting to be able to use it. And they're paying through the nose for the privilege.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                It's all a scam.

                                                                                                                                                                                                                                                                                                                                                                                                                                                            • aizk

                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:21 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                              I'm calling that this will be a dud. Price will be too high, it'll just be a watered down version of mythos, and just look at the track record of Anthropic's last few releases.

                                                                                                                                                                                                                                                                                                                                                                                                                                                              • modeless

                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                Claude Fable 5 beats Pokémon FireRed using only vision: https://www.youtube.com/watch?v=CIQBP1w4B1M

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • uludag

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Any suggestion on how I should calibrate my cynicism towards this?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    I can immagine Anthropic running this experiment multiple times and picking the most impressive one. Or I could immagine like this entire run costing like $1000+ of tokens for this particular run. Or maybe they tried a bunch of Pokemon games and it couldn't even finish some of them. Or is it just able to do this because it has an immense amount of FireRed training data, and if you were to give it an "original" Pokemon game, where it actually had to navigate novel circumstances it would fail.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • svcphr

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Bold move putting in the lvl 3 Pidgey against Gary's Blastoise at the end there (~14sec in... integer timestamps insufficient here).

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • suddenlybananas

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:18 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Is there any more detail about this besides the very fast slideshow?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • modeless

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Seems like the harness was minimal with no extra game state or maps available. Apparently just the screen image. Seems like it took 50 hours in game time which according to Google is at the high end of a normal human playthrough. No idea how long it took in real time though.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • ex-aws-dude

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:34 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          I mean that’s AGI confirmed right?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • Tenoke

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:39 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        >they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Isn't (less than) 5% of sessions a lot? I was expecting a sub1% guarantee there, so this surprised me already.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • joshstrange

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:04 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          > Fable 5 is now consuming usage credits instead of your plan limits.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Literally have not used Claude Code at all today. I asked it to review the uncommitted code and in <8 minutes it used up my usage ($100/mo plan) and it doesn't reset for "4 hr 36 min". WTF. Oh, and it burned through $20 of extra usage before I could catch it and kill claude code (so I don't even get the output of all that work since it was still churning).

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Double the cost my ass, I use Opus heavily and it's never like this. I haven't hit a limit on the $100 more than once and that was under heavy load.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • ATMLOTTOBEER

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:30 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Same lol. I set it to fable + ultracode and it ate my limit in a single prompt

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • rightlane

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 6:11 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            My experiences so far have not been positive. The cyber security nerf is ridiculous. I am working on an AI based decompiler, every single interaction with Fable on my project has been flagged for cyber security.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Do they expect us to use this as a toy? Releasing a new more powerful model but not allowing normal use cases because the word "secure" showed up is a Dilbert comic, not a viable product.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • ibejoeb

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:13 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Ah, you're probably one to ask. They say "queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8." Are they transparent about when that happens, and is it priced at the rate of the underlying model?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • stronglikedan

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:19 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Careful using this with Cursor, especially for corp use. Anthropic will "retain agent request and output data associated with this model, regardless of you Cursor Privacy Mode setting."

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • bluelightning2k

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Congratulations to Anthropic for solving safety on Mythos exactly when the SpaceX compute came online. Nice how that lined up for them.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • Hawkenfall

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:36 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  > To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  While I appreciate being conservative, ~5% at the scale Anthropic is operating at is too massive a number. Speaking from my own experience, the actual number is higher than that as well (working on pretty benign tasks such as porting an old open source game into a different language). Opus 4.8 itself even identifies the gaurd's false-positives when its sub-agents are being blocked.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • I_am_tiberius

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    I'm very suspicious as they sent out an "We're updating our Privacy Policy" email right before the launch. I fear they try to take advantage of their market position by doing things with user data no other company could do because they know users don't have another choice.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • atestu

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:37 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Prob related to this part of the blog post:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • w10-1

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:31 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          It's a specific change: For safety evaluation, Fable data will be retained for the initial period notwithstanding prior opt-out

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • ilaksh

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:34 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        I guess I have kind of a long system prompt, but anyway I just said "hi there" and it replied "What's up?" and that cost me 22 cents. :P

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Anyway we already knew this was going to be expensive.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • cautiouscat

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:30 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          In the automotive world we have benchmarks in HP/torque with the dyno. That’s expensive though, so many depend on their “butt dyno” to judge if their fresh new parts and tune made a difference.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          I’m curious how this will feel to my code “butt dyno”. I haven’t noticed much between Opus and Sonnet. I’m comparing this difference to the early days of Claude in 2025. It does what I need and both need a little bit of correction and whatnot. Benchmarks are nice, but I want to see how this feels. Looking forward to trying it later tonight.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • sunir

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:39 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I have a similar question.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I think most software projects have reached the point that the speed of capturing real information about what the winner's circle looks like, and therefore what the program should be, so many magnitudes slower than the amount of code that can be generated in the wrong direction.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I'd need to measure these new models on well understood but complex problems that are relatively easy to validate to get a sense if they are 'better'; on the other hand, the real impact in daily life may be marginal since generating code is not the biggest problem at the moment.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • impulser_

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Every model release is just proof that AGI will most likely only be for the rich. We are a few years into LLMs and majority of people are already getting priced out of intelligence from LLMs and these are no where near AGI.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • hootz

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:27 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                You are only priced out if you only care for SOTA right now and can't wait for the inevitable cheap model coming in 6 months. DeepSeek, Xiaomi and Moonshot are already really cheap and match frontier performance from 6 months ago.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • dyauspitr

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:54 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    But they’re artificially cheap. When will they be cheap while the company makes a profit.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • hootz

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 6:00 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        They are not artificially cheap, they are still cheap even when hosted by independent inference providers. Are all providers subsidizing their open-weight models?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • modeless

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  This is like looking at mainframe pricing in 1990 and concluding that PCs will only be for the rich. The price of each new level of capability is going to drop like crazy very quickly. It won't be that long before practically any consumer use case will be possible on models that are dirt cheap.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • weakfish

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:51 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      This premise is based around the assumption that Moore's law is still working, which it very much isn't [0]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      [0] https://cap.csail.mit.edu/death-moores-law-what-it-means-and...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • andrewmunsell

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:06 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Improvements in model performance aren't always strictly compute-constrained in a way that makes them reliant on Moore's Law. Open weight models-- in particular, from Chinese labs-- are optimizing model intelligence with less compute. They're "behind" frontier models by months, but as others have noted, it's possible to get Sonnet 4.5+ level performance at reduced cost, today, from open weight labs.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • dyauspitr

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:54 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Hardware manufacturing hasn’t caught up yet. Once it does, especially in China these token prices are going to drop hard.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • raoulj

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:04 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  On this thread and similar, I'm noticing that some strong opinions about $LLM_PROVIDER are coming from accounts without much post history. With so much on the line, and the way that HN can influence developer behavior, I wonder what ways we can responsibly consume opinions in a thread like this.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Not to cast too much criticism. HN is extremely well-moderated (thanks team!). But think we-developers need to be very wary.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • Karrot_Kream

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 6:15 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      I think the community on this site these days, much like other comment sections on the web, just read the headline and make a low effort comment. Regression to the mean I guess.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • recitedropper

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 6:12 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Do you see the pattern as new accounts tending to boost or criticis $LLM_PROVIDER? I think I see both...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Either way, I agree that HN is quickly becoming more manipulated and low SNR, like the rest of the entire internet.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • __alexs

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:35 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Asked it to review some of my own blood test results and it immediately turned itself off and went back to Opus. Pretty disappointing.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • gslepak

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:53 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Genius way to double the price on Opus 4.8!

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • bradleyg223

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:35 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          This is a very particular use case/test, but my first prompt on a new model is always "write a solo fingerstyle guitar tab that blends ragtime, bluegrass, and gypsy jazz". This is the first model that has responded with something that isn't just a boring arpeggio of chords, so from my perspective it's off to a good start.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • kypro

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:43 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Would you mind sharing?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • giancarlostoro

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:05 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Found this via Google:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • 2001zhaozhao

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:56 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              We'll need a lot of good summarization techniques to cut down on the cost of this model. I expect that a common use of Fable 5 is to just do high level direction while delegating literally all work (exploration and implementation) to Opus subagents.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              BTW for another discount opportunity, if you reload usage credits on a claude.ai plan at $1000 increments then you get a 30% discount compared to paying API.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • balverineorder

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                I have been refactoring a project using Opus 4.8 for the last week or so. I just decided to switch to Fable 5 max. It stopped half way through and it just blocked me and switched back to Opus 4.8 automatically. "This model has specific safety measures that flagged something in this message. This sometimes happens with safe, normal conversations. Send feedback or learn more." I left feedback saying that their heuristics are too sensitive. For now I will not be using Fable 5.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                [0] https://support.claude.com/en/articles/15363606-why-claude-s...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • theLiminator

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:07 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  > We have also added safeguards related to frontier LLM development. As discussed in Section 6.1 of our February 2026 Risk Report, we are concerned about the risks of accelerating the overall pace of AI development, though we remain uncertain about the severity of these risks. In particular, our concern is with—as we wrote then—“accelerating other AI developers in building powerful AI systems that pose similar risks to the ones ours pose - without necessarily having commensurate safeguards.” In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. 
We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. When these interventions are active, we expect them to have minimal behavioral impact on the model except to limit its effectiveness in developing frontier LLMs. Claude will still respond helpfully to user requests. We’ll continue to improve the precision of our detection methods following the launch of this model.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  This seems pretty bullshit, you're paying through the nose for tokens and if you are doing anything ML-adjacent, you might silently get worse output without knowing it.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • killiancarroll

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:25 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    A large jump in performance for double the token cost compared to Opus 4.8. Potentially worth it for planning work, likely better to offload to a less expensive model when the hard decisions are made.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • erghjunk

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:54 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Nice branding.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    I wonder how much butterfly habitat has been/is being replaced with data centers?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • siliconc0w

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:42 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Sadly, I'm getting a lot of forced downgrades to Opus for questions that are far removed from any security topic.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • SandmanDP

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:45 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Literally within minutes of this announcement I was both charged for another month and had my subscription suspended due to the “charge being unsuccessful”. What kind of scam is Anthropic running here? I can’t even find a way to get in touch with their billing department to contest this

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • bradley13

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:00 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          I use AI for a wide variety of things, of which technical is only a small part - and then it's usually a problem with project configuration, not coding. Why? Because I am often testing projects handed in by students. Projects that supposedly work on their machine, but certainly do not on mine.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Anyway, anecdotally, I find Copilot shockingly awful. It makes random changes to files that have nothing to do with the problem. Call it out, and it makes other changes to other irrelevant files.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ChatGPT and Gemini are both much better. Grok also isn't bad. Claude, I honestly haven't tried yet on these issues. Perhaps I should...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • PeterStuer

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:36 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            If you are not seeing it under /model, do a /exit , then a Claude upgrade, then /model again and it should be there.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • brianmcnulty

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:09 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I wonder how Claude Fable will live up to expectations and how good those Fable/Mythos classifiers really are. It seems a bit convenient for Anthropic to release this magical insane model when they are about to IPO.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • yandie

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:21 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Of course it's all about building the hype for the IPO :)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • jwpapi

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:15 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Honestly all the recent improvements, just seem to be slower and more expensive traded for more accuracy, but the issue is that it needs to be exponentially more accurate to counter the effect of having less of a human in a loop.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Every wrong direction/mistake is more expensive and takes more time to fix. When you have small loops you can catch those mistakes faster and cheaper.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                To me we are very far off from economically given long-running tasks to agents.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • lkm0

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:11 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  I'm a bit out of the loop, but do we have some grasp on the size of these closed models? Is the trick still adding an order of magnitude to weights and training data or has something changed?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • m_w_

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      I think Mythos is rumored to be ~10T parameters, so in this case I think the answer is yes, although I'm sure MoE, looped models, etc play a role in the improvements as well.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • pookieinc

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    If this is as epic as it sounds, I wonder what the response will be from the other leading frontier labs / whether they even have anything to respond with at this level?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • ilaksh

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Look at the benchmarks. It's a big leap in some areas, but it's not like any of them are 60% better (if that could even make sense).

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • today at 5:33 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • dangoodmanUT

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 6:06 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Not comparing to GPT Pro models is a bit strange, considering that's the natural comparison

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • merlindru

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > During early testing, Stripe reported that Fable 5, [...] in a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        EDIT: I misread. This comment previously talked about 50 million lines being migrated. Instead, in a 50M LOC codebase, one specific codebase-wide migration was done.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Very impressive, but obviously not on the order of a whole-codebase migration

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • christina97

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:17 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            They do not claim to have migrated 50 million lines of Ruby. Simply that some migration took place in such a codebase.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • reddit_clone

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:01 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Converted all the tabs to spaces? :-)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                You are right, this is not a rewrite like the Bun case.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                The real news is, at 50M LOC, it is able to handle and do _something_ coherent.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • geodel

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Ok, so Stripe migrated their 50MLOC codebase from Ruby to Rust? Because that's what Bun did.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • darrinm

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 6:18 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Not supported in Claude Code yet?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • pmuk

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                From inside a claude code session:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                /model claude-fable-5

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Or start claude code with:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                claude --model claude-fable-5

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • darrinm

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 6:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Yeah, /model fable also worked for me (despite not being shown on the /model list). Thanks.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • Retr0id

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:41 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              The escalating nerfs of "cybersecurity" topics is incredibly frustrating. Opus 4.6 had boundaries that seemed reasonable to me but 4.7+ turned it into a moralizing asshole. It'd be less bad if it just gave an error message, but instead it churns a long thinking trace before writing an essay about why what you're asking is bad and wrong.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I'll be disappointed when 4.6 is retired.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • knollimar

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:11 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                I swear I read a joke that "what if we named chatgpt 5.5 Fable. Could we hype it as much as mythos?" Last week!

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • yokoprime

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Probably great for those who need this. I could continue using opus 4.6 class models for the foreseeable future

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • BenoitEssiambre

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:29 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Looks like a good model (sir). Costs are getting out of control though. 2x Opus and non-metered usage going away. We're quickly approaching the cost of a human salary for normal usage.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • vb-8448

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:26 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          In a lot of places outside US we are already above the average cost of an average human.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • Overpower0416

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:15 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        I would expect a release from OpenAI soon. The battle for who can pump up their IPO the most

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • JustSkyfall

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:59 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Would be more impressive if the safeguards weren't so trigger-happy!

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • Ninjinka

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:56 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              gah could model naming be any more confusing?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              "Claude Fable 5: a Mythos-class model"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              "we're also launching Claude Mythos 5"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              what is the 5? how is mythos both a model category and a model name?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • fabled-out

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 6:27 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                This i

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • rfgplk

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:19 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  If the claimed capabilities are true, Fable 5 is already at a superhuman level. We might see genuine unprecedented leaps in technology now, across all fields.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • today at 5:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • gear54rus

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        yees, any second now!

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        the leap here is browser extensions appearing to block all mentions of ai across the web

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        and that's a good thing

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • himata4113

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 6:05 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > virtualization
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        switching to opus 4.8
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ok fair

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > embedded-allocator
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        switching to opus 4.8
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      urgh fine

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > chrome
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        switching to opus 4.8
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      are you kidding me?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • arkwin

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 6:10 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Just wanted to comment here: I have been using Opus 4.6, 4.7, and 4.8 just fine to look for Linux kernel vulnerabilities (I'm in the cyber verification program), and it's been fine. I switched to Claude Fable 5, and now I'm getting policy violations.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        What's the point of being in the cyber verification program at this point? It looks like I cannot use Fable 5 for vulnerability research.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • asdK120

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:31 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          In other words, Fable is Mythos with less compute and with some feel good "safeguards".

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          At least they name their models honestly now to indicate that the religion has nothing to do with reality. Soon the disciples will pay the full token price to fatten their church leaders.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • today at 5:07 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • taimurshasan

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:30 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I was on board until i saw " $50 per million output tokens" lost me bud

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • today at 6:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • hydra-f

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:12 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  How much and what kind of data do you need to throw at these models to get a good design interface?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • nevir

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:27 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    "Fable 5 (disabled) Most capable for your hardest and longest-running tasks · Disable zero data retention to unlock Fable 5 access"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • __lain__

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 6:19 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      It won't even run a basic /security-review command without reverting to Opus 4.8. Utterly useless.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • wslh

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        I am playing with it and keeps switching to Opus [1]. The chat is a basic security review of a business project.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        [1] "This model has specific safety measures that flagged something in this message. This sometimes happens with safe, normal conversations. Send feedback or learn more."

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • aykutseker

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:52 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            who's tried it: is 2x the usage actually worth it over Opus 4.8 for daily work?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • system2

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I have been using FABLE 5 with Claude Code since the morning. The speed is very close to what Opus 4.5 was, and the quota use is nearly identical to what it was before the "doubling". Whatever I was experiencing 4-5 months ago is back. Maybe the model is better, but we will see. I cannot tell the difference yet.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • kypro

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Out of interest, how have you been using it since this morning? Are you in some kind of pre-release group?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • segmondy

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:44 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Mythos, Fable, are they trolling us?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • IChooseY0u

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:42 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more: https://support.claude.com/en/articles/15363606 ⎿ Tip: You can configure model switch behavior in /config

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  biology? what the heck?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • alvis

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:20 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Another thing to note: 30-day retention for all traffic on Mythos-class models

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Is it good or bad? 30 days is a long time for anything bad to happen

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • geopsist

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:05 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      the post is live now https://www.anthropic.com/news/claude-fable-5-mythos-5

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • today at 5:19 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • xeyownt

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 6:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Anthropic, can you please stop the FUD?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Release your best model, let the world adapt and evolve, and let's move to the next thing.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • bradley13

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 6:03 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Can we please stop with the extreme "safeguards"? I don't want to waste processing power on a model deciding whether is can answer my question, or ensuring that it's answer is politically correct.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • 152334H

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:13 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              i wasn't even trying and i got flagged already...

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • pmuk

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:14 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Anyone got it working in claude code yet?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • pmuk

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:22 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    claude --model claude-fable-5

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    appears to work

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • Sathwickp

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:34 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  input price $10 per mil token and output price 50$ per mil token btw
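To put those per-token prices in concrete terms, here is a rough sketch of the arithmetic. The two rates are the ones quoted in the comment above (not independently verified), and the function name and the example workload are made up purely for illustration.

```python
# Rates as quoted in the comment above, in USD per million tokens.
# These are taken from the thread, not from an official price list.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request at the quoted rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    print(estimate_cost_usd(200_000, 20_000))  # 3.0
```

At these rates output tokens cost five times as much as input tokens, so a session that generates a lot of code can be dominated by output cost even when the context it reads is much larger.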

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • jckahn

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:09 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Cannot wait for the pelican for this one

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • charcircuit

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:59 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      >During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Who is refactoring by hand? This comparison is not relevant in 2026.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • pablogancharov

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:23 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        you can select it using /model fable in claude desktop and claude-code

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • throwaway2027

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:13 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Will try it when my limit resets.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • bnchrch

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:11 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            An 11% jump over opus 4.8 and a 22% jump over gpt 5.5 on Agentic Coding Benchmarks is certainly impressive.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Obviously still need to verify it for myself to see if it's truely a leap.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            But am I the only one wondering, "What can I do today that I couldnt do yesterday?"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Previously I would think "Oh I wonder if I can finally get it to do X now?"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            However now I feel like yesterdays models were more that capable to handle nearly any engineering task I paired with it on.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Maybe this is the final leap where I can comfortable set up an autonomous coding loop? Maybe.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • yaodub

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:28 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                [dead]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • firemelt

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:48 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              they are like drugs dealer

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • w4yai

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:10 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Pelican guy ! Where are you ? :)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • bitpush

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 4:59 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        404?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • catigula

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        today at 5:40 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        >The capabilities of models like Fable 5 and Mythos 5 have the potential to do profound good for the world

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Huh? We've seen nothing but wall to wall predictions that these models are going to take all of our jobs and kill us.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        What's the value add here?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • byteoptimizer

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:12 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Is Claude Fable 5 is Mythos ?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • tekla

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:07 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Maybe at this point, Fable the game will be played generated by AI as we go.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • CoderAshton

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:21 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              [dead]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • Stevvo

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                [dead]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • hmokiguess

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:39 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  I have got it to one shot GTA 6 we can finally play it, it only took ultracode make no mistakes (/s)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • mugivarra69

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:33 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    [dead]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • solenoid0937

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:34 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      [flagged]

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • weakfish

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          today at 5:45 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          From the rules [0]:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          > Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          [0] https://news.ycombinator.com/newsguidelines.html

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • javawizard

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 5:59 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              They didn't say that HN is turning into Reddit, they said that the conversation quality has gone to shit.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              I don't agree with that statement universally, but I have to say I do when it comes to this article. I came here hoping for substantive discussion from those who'd had a chance to try it out; instead what I got was a seemingly endless stream of venting. There's a place for venting - and plenty to vent about with the state of AI nowadays - but to borrow from the HN guidelines you linked, it does very little to gratify my personal intellectual curiosity.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • 10xDev

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:42 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Nothing here is new, it is the thing we have been talking about for a while but now with guardrails.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • tripleee

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              today at 6:09 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Hate to break it to you but those "informed takes" were from people who prompted it once then made a snap judgement

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • Karrot_Kream

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 6:12 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  That is 1000x better than griping about the privacy policy, capacity issues, token costs, and how trendy the names are for the new models (???). The bar is on the floor and I just want it at my knees.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • bjord

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            today at 5:06 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            I thought they said mythos was too dangerous to make generally available?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • Philpax

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                "Releasing a model this capable comes with risks. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage. We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we’re working to improve our safeguards and reduce false positives as quickly as we can.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                For a small group of cyberdefenders and infrastructure providers, we’re also launching Claude Mythos 5. It’s the same underlying model as Fable 5, but with the safeguards lifted in some areas.2 Mythos 5 will initially be deployed through Project Glasswing, in collaboration with the US Government, as an upgrade to Claude Mythos Preview. It has the strongest cybersecurity capabilities of any model in the world. Soon, we intend to expand access to Mythos 5 through a broader trusted access program."

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • dmix

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  today at 5:07 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  This is covered in their post…

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • tomeraberbach

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    today at 5:09 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    "Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage. We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8."

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • rvz

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      today at 5:08 PM

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      You fell for their fearmongering and marketing fundraising call which was done on purpose.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Now they want to pause AI because of "recursive self improvement".

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Fool me once shame on you fool me twice...