Tiled Hacker news on React Router

Project Glasswing: Securing critical software for the AI era

451 points - today at 6:09 PM

pizlonator
today at 8:14 PM
It's messed up that Anthropic simultaneously claims to be a public benefit corp and is also picking who gets to benefit from their newly enhanced cybersecurity capabilities. It means that the economic benefit is going to the existing industry heavyweights.
(And no, the Linux Foundation being in the list doesn't imply broad benefit to OSS. Linux Foundation has an agenda and will pick who benefits according to what is good for them.)
I think it would be net better for the public if they just made Mythos available to everyone.
9cb14c1ec0
today at 6:54 PM
Now, it's very possible that this is Anthropic marketing puffery, but even if it is half true it still represents an incredible advancement in hunting vulnerabilities.
It will be interesting to see where this goes. If it's actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes. My assumption has been for years that companies like NSO Group have had automated bug hunting software that recognizes vulnerable code areas. Maybe this will level the playing field in that regard.
It could also totally reshape military sigint in similar ways.
Who knows, maybe the sealing off of memory vulns for good will inspire whole new classes of vulnerabilities that we currently don't know anything about.
redfloatplane
today at 6:29 PM
The system card for Claude Mythos (PDF): https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
Interesting to see that they will not be releasing Mythos generally. [edit: Mythos Preview generally - fair to say they may release a similar model but not this exact one]
I'm still reading the system card but here's a little highlight:
> Early indications in the training of Claude Mythos Preview suggested that the model was likely to have very strong general capabilities. We were sufficiently concerned about the potential risks of such a model that, for the first time, we arranged a 24-hour period of internal alignment review (discussed in the alignment assessment) before deploying an early version of the model for widespread internal use. This was in order to gain assurance against the model causing damage when interacting with internal infrastructure.
and interestingly:
> To be explicit, the decision not to make this model generally available does _not_ stem from Responsible Scaling Policy requirements.
Also really worth reading is section 7.2 which describes how the model "feels" to interact with. That's also what I remember from their release of Opus 4.5 in November - in a video an Anthropic employee described how they 'trusted' Opus to do more with less supervision. I think that is a pretty valuable benchmark at a certain level of 'intelligence'. Few of my co-workers could pass SWEBench but I would trust quite a few of them, and it's not entirely the same set.
Also very interesting is that they believe Mythos is higher risk than past models as an autonomous saboteur, to the point they've published a separate risk report for that specific threat model: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de4321...
The threat model in question:
> An AI model with access to powerful affordances within an organization could use its affordances to autonomously exploit, manipulate, or tamper with that organization’s systems or decision-making in a way that raises the risk of future significantly harmful outcomes (e.g. by altering the results of AI safety research).
jryio
today at 6:29 PM
Let's fast forward the clock. Does software security converge on a world with fewer vulnerabilities or more? I'm not sure it converges equally in all places.
My understanding is that the pre-AI distribution of software quality (and vulnerabilities) will be massively exaggerated. More small vulnerable projects and fewer large vulnerable ones.
It seems that large technology and infrastructure companies will be able to defend themselves by preempting token expenditure to catch vulnerabilities while the rest of the market is left with a "large token spend or get hacked" dilemma.
ssgodderidge
today at 6:47 PM
At the very bottom of the article, they posted the system card of their Mythos preview model [1].
In section 7.6 of the system card, it discusses open self-interactions. They describe running 200 conversations in which the model talks to itself for 30 turns.
> Uniquely, conversations with Mythos Preview most often center on uncertainty (50%). Mythos Preview most often opens with a statement about its introspective curiosity toward its own experience, asking questions about how the other AI feels, and directly requesting that the other instance not give a rehearsed answer.
I wonder if this tendency toward uncertainty, toward questioning, makes it uniquely equipped to detect vulnerabilities where other models such as Opus couldn't.
[1] https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
SheinhardtWigCo
today at 6:59 PM
Society is about to pay a steep price for the software industry's cavalier attitude toward memory safety and control flow integrity.
cbg0
today at 6:31 PM
One of the things I'm always looking at with new models released is long context performance, and based on the system card it seems like they've cracked it:
```
  GraphWalks BFS 256K-1M

  Mythos     Opus     GPT5.4

  80.0%     38.7%     21.4%
```
dang
today at 8:24 PM
Related ongoing threads:
System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258
Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155
I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?
agrishin
today at 6:31 PM
>>> the US and its allies must maintain a decisive lead in AI technology. Governments have an essential role to play in helping maintain that lead, and in both assessing and mitigating the national security risks associated with AI models. We are ready to work with local, state, and federal representatives to assist in these tasks.
How long would it take to turn a defensive mechanism into an offensive one?
josh-sematic
today at 7:23 PM
Must be nice to be in a position to sell both disease and cure.
Sol-
today at 7:36 PM
I don't want to be overly cynical and am in general in favor of the contrarian attitude of simply taking people at their word, but I wonder if their current struggles with compute resources make it easier for them to choose to not deploy Mythos widely. I can imagine their safety argument is real, but regardless, they might not have the resources to profitably deploy it. (Though on the other hand, you could argue that they could always simply charge more.)
Miraste
today at 6:39 PM
> We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.
This seems like the real news. Are they saying they're going to release an intentionally degraded model as the next Opus? Big opportunity for the other labs, if that's true.
bredren
today at 8:08 PM
Can anyone point at the critical vulnerabilities already patched as a result of Mythos? (see 3:52 in the video [0])
For example, the 27-year-old OpenBSD remote crash bug, or the Linux privilege escalation bugs?
I know we've discussed some long-standing, high-profile, LLM-found bugs, but it seems unlikely anyone speculated that they were found by a previously unannounced frontier model.
[0] https://www.youtube.com/watch?v=INGOC6-LLv0
taupi
today at 6:30 PM
Part of me wonders if they're holding it back not for safety reasons, but just because it's too expensive to serve. Why not both?
Ryan5453
today at 6:19 PM
Pricing for Mythos Preview is $25/$125, so cheaper than GPT 4.5 ($75/$150) and GPT 5.4 Pro ($30/$180)
zachperkel
today at 6:27 PM
Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.
Scary but also cool
underdeserver
today at 7:52 PM
Interesting also is what they didn't find, e.g. a Linux network stack remote code execution vulnerability. I wonder if Mythos is good enough that there really isn't one.
kmfrk
today at 8:28 PM
Heck of a Patch Tuesday.
Sateeshm
today at 7:36 PM
The bars are solid-filled for Mythos and cross-hatched for Opus 4.6, which makes the difference feel bigger than it actually is.
jFriedensreich
today at 7:25 PM
The only reassuring thing is the Apache and Linux Foundation setups. Let's hope this is not just an appeasing mention but something more fundamental. If there are really models too dangerous to release to the public, companies like Oracle, Amazon and Microsoft would absolutely use this exclusive power to not just fix their holes but to damage their competitors.
anVlad11
today at 6:48 PM
So, $100B+ valuation companies get essentially free access to the frontier tools with disabled guardrails to safely red team their commercial offerings, while we get "i won't do that for you, even against your own infrastructure with full authorization" for $200/month. Uh-huh.
SirYandi
today at 8:25 PM
This sets off marketing BS alarm bells. All the cosignatories so very obviously have a vested interest in AI stocks / sentiment. Perhaps not the Linux Foundation, although (I think) they rely on corporate donations to some extent.
throwaway13337
today at 7:20 PM
I really wanted to like Anthropic. They seem the most moral, for real.
But at the core of Anthropic seems to be the idea that they must protect humans from themselves.
They advocate government regulations of private open model use. They want to centralize the holding of this power and ban those that aren't in the club from use.
They, like most tech companies, seem to lack the idea that individual self-determination is important. Maybe the most important thing.
endunless
today at 6:30 PM
Another Anthropic PR release based on Anthropic’s own research, uncorroborated by any outside source, where the underlying, unquestioned fact is that their model can do something incredible.
> AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities
I like Anthropic, but these are becoming increasingly transparent attempts to inflate the perceived capability of their products.
picafrost
today at 6:38 PM
> Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. [...] We are ready to work with local, state, and federal representatives to assist in these tasks.
As Iran engages in a cyber attack campaign [1] today the timing of this release seems poignant. A direct challenge to their supply chain risk designation.
[1] https://www.cisa.gov/news-events/cybersecurity-advisories/aa...
dakolli
today at 7:34 PM
I guess we can throw out the idea that AGI is going to be democratized. In this case a sufficiently powerful model has been built and the first thing they do is only give AWS, Microsoft, Oracle, etc. access.
If AGI is going to be a thing, it's only going to be a thing for Fortune 100 companies.
However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.
oyebenny
today at 7:18 PM
Why do I feel like the auditing industry is about to evaporate thanks to this?
zb3
today at 7:57 PM
BTW it seems they forgot about the part that defense uses of the model also need to be safeguarded from people. Because what if a bad person from a bad country tries to defend against peaceful attacks from a good country like the US? That would be a tragedy, so we need to limit defensive capabilities too.
nickandbro
today at 7:08 PM
I want it
impulser_
today at 6:32 PM
So they are only giving access to their smartest model to corporations.
You think these AI companies are really going to give AGI access to everyone? Think again.
We better fucking hope open source wins, because we aren't getting access if it doesn't.
baddash
today at 7:00 PM
> security product
> glass in the name
Fokamul
today at 7:32 PM
+ NSA, CIA
zb3
today at 7:53 PM
> On the global stage, state-sponsored attacks from actors like China, Iran, North Korea, and Russia have threatened to compromise the infrastructure that underpins both civilian life and military readiness.
Yeah, makes sense. Those countries are bad because they execute state-sponsored cyber attacks, the US and Israel on the other hand are good, they only execute state-sponsored defense.
0xbadcafebee
today at 6:33 PM
tl;dr we find vulns so we can help big companies fix their security holes quickly (and so they can profit off it)
This is a kludge. We already know how to prevent vulnerabilities: analysis, testing, following standard guidelines and practices for safe software and infrastructure. But nobody does these things, because it's extra work, time and money, and they're lazy and cheap. So the solution they want is to keep building shitty software, but find the bugs in code after the fact, and that'll be good enough.
This will never be as good as a software building code. We must demand our representatives in government pass laws requiring software be architected, built, and run according to a basic set of industry standard best practices to prevent security and safety failures.
For those claiming this is too much to ask, I ask you: What will you say the next time all of Delta Airlines goes down because a security company didn't run their application one time with a config file before pushing it to prod? What will happen the next time your social security number is taken from yet another random company entrusted with vital personal information and woefully inadequate security architecture?
There's no defense for this behavior. Yet things like this are going to keep happening, because we let it. Without a legal means to require this basic safety testing with critical infrastructure, they will continue to fail. Without enforcement of good practice, it remains optional. We can't keep letting safety and security be optional. It's not in the physical world, it shouldn't be in the virtual world.
anuramat
today at 6:36 PM
"oops, our latest unreleased model is so good at hacking, we're afraid of it! literal skynet! more literal than the last time!"
almost like they have an incentive to exaggerate
minutesmith
today at 7:11 PM
[dead]
hackerman70000
today at 6:42 PM
[dead]
hackerman70000
today at 6:46 PM
[dead]
NickNaraghi
today at 6:32 PM
[dead]
ehutch79
today at 6:23 PM
Just include 'make it secure' in the prompt. Duh.
/s
LoganDark
today at 6:22 PM
It's nice to know that they continue to be committed to advertising how safe and ethical they are.
yusufozkan
today at 6:45 PM
but people here had told me llms just predict the next word
dakolli
today at 7:39 PM
If this is as dangerous as they make it out (it's not), why would their first impulse be to get every critical product/system/corporation in the world to implement its usage?
cyanydeez
today at 7:19 PM
Project: Advertisement!
