
Watchtower
Watchtower tracks changes to corporate and government AI safety policies, both announced and unannounced. Click any entry for details.
Date:
Anthropic
Moderate
Released v2.1 of their Responsible Scaling Policy. Key changes include:
Added and clarified new capability thresholds and safeguards for CBRN and AI R&D risks
Removed commitment to “define ASL-N+1 evaluations by the time we develop ASL-N models”
Confusingly, the AI R&D thresholds, which were implicitly the next capability level (ASL-3), are now labeled “4.” They are nonetheless still associated with ASL-3 security mitigations.
A diff of the changes can be found below: