
Watchtower
Watchtower tracks changes to corporate and government AI safety policies, both announced and unannounced. Click any entry for details.
Date: Apr 7, 2026
Meta · Change · Major
Meta released version 2 of its AI safety policy, renaming it to “Advanced AI Scaling Framework.” This is the first major update since the initial version was published in February 2025.
The new policy is essentially a complete rewrite: it retains some structural and procedural similarities with the old policy, but revamps every existing section and adds substantial new detail throughout, bringing it closer to competitors’ policies in rigor and depth.
V1’s critical threshold required that a model “uniquely enable” a threat scenario — an extremely high bar. V2 keeps that standard only for deployment where the risk cannot be mitigated, and adds a second, lower trigger, “substantially contribute to,” which applies to ongoing development and to deployment more broadly. The high threshold was broadened in the same way.
V1’s critical threshold was labeled “Stop development”; V2 relabels it “Develop with mitigations,” but the underlying language is similar: both versions require that risk be reduced to moderate levels before proceeding. The high threshold is likewise reframed, from “Do not release” to “Deploy with mitigations,” though again the underlying requirements are comparable.
The new version also adds Loss of Control as a third risk domain alongside Cybersecurity and Chemical & Biological risks. It names the Chief AI Officer and a new Director of Alignment and Risk as responsible decision-makers, adds whistleblower protections, commits to publishing preparedness reports and a model spec, and concretely defines what counts as “Frontier AI.” V2 also introduces a more granular deployment taxonomy (internal, limited, controlled, closed release, open release), with different mitigations applying at each level.
A diff of the changes can be found below:
