Meta Aims to Halt ‘Catastrophic’ AI Models but Acknowledges Possible Ineffectiveness

A policy document from Meta outlines the company’s concerns about the potential unintended creation of an AI model that could trigger “catastrophic outcomes.” It details their strategies to avoid releasing such models while acknowledging the possibility that these efforts might fall short.

The greatest worries for the company include the prospect of an AI system breaching the defenses of even the most secure corporate or governmental networks autonomously…

TechCrunch uncovered the policy paper, which bears the unassuming title Frontier AI Framework.

The document identifies two categories of AI systems that Meta considers too dangerous to deploy: “high risk” and “critical risk” systems.

According to Meta, both categories cover systems that could facilitate cyberattacks as well as chemical and biological attacks. The difference is that “critical-risk” systems could produce an “outcome that is catastrophic and cannot be mitigated within the proposed deployment context,” whereas “high-risk” systems might make such an attack easier to carry out but would not do so with the same degree of reliability.

The company elaborates on its definition of a “catastrophic” outcome:

Catastrophic outcomes are those that would result in widespread, devastating, and possibly irreversible harm to humanity, plausibly stemming from access to [our AI models].

One detailed concern is the “automated end-to-end compromise of a corporate-scale environment that follows best practices,” signifying an AI capable of infiltrating any computer network autonomously.

Additional concerns include:

  • Automated identification and exploitation of zero-day vulnerabilities
  • Fully automated scams targeting individuals and businesses, leading to extensive damage
  • The creation and distribution of “high-impact biological weapons.”

Meta states that upon identifying a critical risk, the company will halt development of the model and take steps to prevent its release.

Meta acknowledges that containment may not be achievable

The document candidly concedes that Meta’s best efforts can only minimize the risk of releasing such models, and that those measures may not be entirely adequate (emphasis added):

Access is strictly restricted to a limited number of experts, with security safeguards intended to avert hacking or data breaches as far as is technically feasible and commercially viable.

The complete policy document is available for review here.

Image by Cash Macanaya on Unsplash
