Anthropic says these subjects are too harmful to let its Fable 5 model discuss

Anthropic on Tuesday publicly released Claude Fable 5, its first “Mythos-class” model, which it says surpasses its earlier frontier Opus models in overall capabilities. But the model’s launch today comes with safeguards designed to prevent it from answering queries on subjects like cybersecurity, biology, and chemistry, where the company has publicly worried about its potential to “uplift” malicious actors.

Anthropic says Fable 5 operates on the “same underlying model” as Mythos 5, which is coming out of its monthslong “Mythos Preview” period today, but only for “a small group of cyberdefenders” judged trustworthy by the existing Project Glasswing. Unlike Mythos 5, though, the publicly available Fable 5 is designed to funnel queries on certain sensitive subjects to the earlier Claude Opus 4.8 model and to warn the user when this is happening.

Among the many claimed benchmark improvements for Fable 5, the one related to cybersecurity was a particularly large jump.
Credit: Anthropic

Anthropic said it has tuned these safeguards to be “stricter than ideal,” meaning the system may occasionally refuse “innocent requests” in a way that it acknowledges may be frustrating for regular users. But Anthropic says such false positives arise in less than 5 percent of all sessions in testing, and were worth it to avoid situations where Mythos could give malicious actors help in “causing serious harm that they couldn’t have obtained from other sources.”
