Watchtower tracks changes to corporate and government AI safety policies, both announced and unannounced.

Nov 18, 2024

Anthropic

Slight

On November 18, 2024, Anthropic published a new page titled "Tracking Voluntary Commitments." The page describes how the company is complying with a number of voluntary safety and security frameworks, and it groups these practices into eight risk categories:

  • Risk Assessment and Mitigation

  • Security & Privacy

  • Public Awareness

  • Societal Impact

  • Trust and Safety Commitments

  • Image-Based Sexual Abuse

  • Election Integrity

  • Terrorist and Violent Extremist Content

The tracking page can be found here.