Somebody Constructed an Open-Supply 'Theoretical Mythos' to Reverse-Engineer Anthropic's Most Harmful AI

In short

OpenMythos is a from-scratch reconstruction of the Claude Mythos structure, constructed solely from public analysis papers and educated guesses.
Claude Mythos is Anthropic’s strongest mannequin, locked away in Undertaking Glasswing as a result of it autonomously discovered 271 Firefox vulnerabilities and 32-step community assaults.
The repo is theoretical scaffolding—code with out skilled weights. It mirrors a separate effort by Vidoc Safety that reproduced Mythos’s vulnerability findings utilizing off-the-shelf fashions.

If Anthropic will not present you what is inside its most harmful AI, any person on GitHub will guess.

A developer named Kye Gomez has revealed OpenMythosan open-source reconstruction of what he thinks Claude Mythos appears like below the hood. The repo has picked up over 10,000 GitHub stars in just a few weeks upon launch, and ships with an exhaustive “readme” file stuffed with equations, citations, and a well mannered disclaimer that it has nothing to do with Anthropic.

It is hypothesis. Nevertheless it’s structured hypothesis, in code.

Right here’s a fast refresher on what Mythos is: Mythos leaked into public view in late Marchwhen Anthropic by chance revealed draft supplies describing it as the corporate’s most succesful mannequin so far—a tier above Opus. The follow-up, Mythos Preview, turned out to be unreleasably good at cybersecurity.

Per Anthropic, Mythos discovered 271 vulnerabilities in Firefox throughout Mozilla testing. It turned the primary AI mannequin to finish a 32-step company community assault simulation. Anthropic locked it inside Undertaking Glasswing, a vetted coalition of about 40 companions, together with Microsoft, Apple, Amazon, and the NSA.

The general public by no means will get to the touch it. So Gomez tried to determine the way it works.

OpenMythos’s central guess is that Mythos is a Recurrent-Depth Transformer—additionally referred to as a looped transformer. Normal fashions stack tons of of distinctive layers. Looped fashions take a smaller stack and run it by itself many occasions per ahead move.

In different phrases, it’s the identical weights going by extra iterations. Deeper pondering, in steady latent area, earlier than any token will get emitted.

The repo argues this could clarify Mythos’s two strangest qualities: It causes by novel issues no different mannequin can crack, however its uncooked memorization is uneven. That is the architectural fingerprint of looping—composition over storage.

OpenMythos cites Parcae, an April 2026 paper from College of California San Diego and Collectively AI that solved the long-standing instability downside in looped fashions—a 770 million-parameter Parcae mannequin matches a 1.3 billion fixed-depth transformer on high quality, with predictable scaling legal guidelines for what number of loops to run. The repo additionally borrows DeepSeek’s Multi-Latent Consideration to compress reminiscence, and a Combination-of-Specialists setup to deal with breadth throughout domains.

What it doesn’t have is weights, so mainly it’s a method with out an executor.

OpenMythos is theoretical. The code defines mannequin variants from 1 billion to 1 trillion parameters, however it’s a must to prepare them your self—the readme file factors to a 3 billion parameter coaching script on FineWeb-Edu and a Chinchilla-adjusted 30 billion-token goal, which is the type of compute invoice that runs into tons of of 1000’s of {dollars} on H100s. No one’s performed it but.

So why does it matter?

As a result of it is the second time in a month any person has chipped on the wall round Mythos. The primary was a examine from Vidoc Safety, which reproduced a number of of Mythos’s most alarming vulnerability findings utilizing GPT-5.4 and Claude Opus 4.6 inside an open-source agent. No Glasswing entry, and at below $30 per scan. Completely different angle, similar conclusion: The moat round Mythos could also be thinner than the advertising and marketing recommended.

OpenMythos and the Vidoc replication are doing completely different jobs. Vidoc reproduced Mythos’s outputs—the vulnerability discoveries themselves—utilizing present fashions. OpenMythos is attempting to breed the structure—the precise machine that produces these outputs. One says you do not want Mythos to search out the bugs Mythos discovered. The opposite says, finally, you may have the ability to construct one thing like Mythos your self.

Anthropic nearly actually would not share Gomez’s architectural guesses publicly, and a number of other of the design selections in OpenMythos are specific hedges—the readme file makes certain to be imprecise sufficient so customers know that is simply an strategy. It repeatedly says “doubtless,” “suspected,” and “nearly actually.” Actual Mythos is probably not a looped transformer in any respect. Or it is perhaps one with particulars Gomez hasn’t reverse-engineered but.

What OpenMythos demonstrates is that the analysis literature already incorporates a lot of the items. Looped transformers, Combination of Specialists, Multi-Latent Consideration, Adaptive Computation Time, Parcae’s stability repair—none of it’s proprietary. The repo is, greater than something, a list of what is publicly recognized about how you can construct a Mythos-class mannequin.

The repo is licensed MIT, and it has 2,700 forks already. The coaching script is sitting there, ready for somebody with a GPU cluster and a thesis to show.

Day by day Debrief Publication

Begin each day with the highest information tales proper now, plus unique options, a podcast, movies and extra.

Source link

Login

Register

In short

Day by day Debrief Publication

Related posts