
Watchtower
Watchtower tracks changes to corporate and government AI safety policies, both announced and unannounced.
Date: Apr 2, 2026
Organization: Anthropic
Change: Slight
Anthropic made minor updates to its Responsible Scaling Policy (RSP), moving it from version 3.0 to 3.1. The update adds detail about the criteria a model must meet to reach the AI R&D automation threshold, clarifying that the standard is based on enhanced progress in aggregate AI capabilities rather than on researcher productivity. It also clarifies that Anthropic remains free to take precautionary actions, such as pausing AI development, if it deems them appropriate, even where the RSP does not require them.
An additional edit, which Anthropic doesn’t mention in the announcement on its website, appears in section 3.6, on external review of risk reports. Both versions say that Anthropic will conduct an external review of a risk report whenever the report covers a “highly capable” model and is meaningfully redacted. V3.0 uses the AI R&D automation threshold to determine whether a model is “highly capable” in this context, but adds that Anthropic hopes to develop metrics for high capability in other domains, listing energy, robotics, and weapons development as examples. V3.1 removes that aspiration, stating simply that the AI R&D automation threshold will be used. (This definition of “highly capable” is also referenced in Appendix A: Commitments Related to Competitors.)
A diff of the changes can be found below:
