
Earlier this week, ChatGPT owner OpenAI published its guidelines for assessing the possible “catastrophic risks” posed by its artificial intelligence models. The announcement comes only one month after the company’s CEO, Sam Altman, was fired by the board and then re-hired days later after staff and investors rebelled.

The company’s statement, titled “Preparedness Framework,” reads: “We believe the scientific study of catastrophic risks from AI has fallen far short of where we need to be.”

According to Techxplore, the framework is meant to help address this gap. A monitoring and evaluation team will focus on “frontier models,” the extremely high-capability models currently in development. The team will assess each model individually and assign it a risk level, from “low” to “critical,” in four main categories. Only models with a risk score of “medium” or below will be approved for deployment.

The four risk categories are as follows:

  • The first category concerns cybersecurity and the model’s ability to carry out large-scale cyberattacks.
  • The second category measures the software’s inclination to help create things that could harm humans, such as a chemical mixture, a biological agent like a virus, or a nuclear weapon.
  • The third category concerns the model’s power of persuasion and the extent to which it is able to influence human behavior.
  • The fourth category concerns the model’s potential autonomy, and more specifically whether it can escape the control of its creators.
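The deployment rule described above can be sketched in a few lines of code. This is purely illustrative: the category names, the ordering of risk levels, and the choice to treat a model's overall risk as its worst category score are assumptions for the sake of the example, not OpenAI's actual implementation.

```python
# Illustrative sketch of the deployment rule in the article.
# Category names and scoring logic are assumptions, not OpenAI's code.
RISK_LEVELS = ["low", "medium", "high", "critical"]  # ordered least to most severe

def overall_risk(category_scores):
    """Assume a model's overall risk is its worst (highest) category score."""
    return max(category_scores.values(), key=RISK_LEVELS.index)

def may_deploy(category_scores):
    """Per the article, only models scoring 'medium' or below may be deployed."""
    return RISK_LEVELS.index(overall_risk(category_scores)) <= RISK_LEVELS.index("medium")

scores = {
    "cybersecurity": "low",
    "weapons_creation": "medium",
    "persuasion": "low",
    "autonomy": "low",
}
print(may_deploy(scores))  # True: the worst category score is "medium"
```

Raising any single category to “high” or “critical” would flip the result to `False`, which matches the article's description of a per-model gate rather than an average across categories.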

Once the risks have been identified, the team will submit its findings to OpenAI’s Safety Advisory Group, a new body tasked with making recommendations either to the CEO or to a person appointed by him, who will then decide what changes the model in question requires to reduce the risks.