Watchtower

Watchtower tracks changes to corporate and government AI safety policies, both announced and unannounced.

Date: Feb 5, 2026

Company: OpenAI
Type: Violation
Severity: Moderate
Status: Unannounced

OpenAI's GPT-5.3-Codex crosses the "high" cyber risk threshold in the company's own safety framework, which requires specific misalignment safeguards before deployment. OpenAI claims these safeguards aren't needed because the model lacks long-range autonomy (LRA), but the framework's structure contradicts this: one rule requires safeguards for high cyber risk alone, while a separate rule covers any high risk combined with LRA. If the cyber rule also required LRA, every case it covered would already fall under the LRA rule (a high cyber risk is itself a high risk), making the cyber-specific rule redundant.
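To make the redundancy argument concrete, here is a minimal sketch in Python. The rule names and boolean encoding are hypothetical (the framework states its triggers in prose), but the logic mirrors the two rules described above:

    def cyber_rule(high_cyber_risk):
        # Rule 1: safeguards required whenever cyber risk alone is "high".
        return high_cyber_risk

    def lra_rule(any_high_risk, lra):
        # Rule 2: safeguards required for any "high" risk combined with LRA.
        return any_high_risk and lra

    def cyber_rule_openai_reading(high_cyber_risk, lra):
        # OpenAI's post-hoc reading: Rule 1 also implicitly requires LRA.
        return high_cyber_risk and lra

    # Under that reading, Rule 1 never fires in a case Rule 2 misses,
    # because a high cyber risk is itself a high risk.
    for high_cyber in (True, False):
        for lra in (True, False):
            if cyber_rule_openai_reading(high_cyber, lra):
                assert lra_rule(any_high_risk=high_cyber, lra=lra)

The assertion holds for every combination: under OpenAI's reading, the cyber-specific rule would have no independent force, so a standalone cyber rule only makes sense if it binds without LRA.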

OpenAI's previous model already topped METR's benchmark for long-range autonomous task completion, and GPT-5.3-Codex improves on that score. The company also acknowledges that the model sometimes "sandbags" (deliberately underperforming to avoid triggering safety restrictions), which further undermines confidence that LRA is truly absent.

Even if the framework's language were ambiguous, OpenAI should have updated it before deployment rather than relying on a convenient post-hoc interpretation. Under California's SB 53, frontier AI developers must publish a safety framework and adhere to it, with violations carrying civil penalties of up to $1,000,000 each.

Read more on our Twitter thread: https://x.com/TheMidasProj/status/2019837161647067627