Futures

OpenAI’s Preparedness Team Ensures Safety of Frontier AI Models (2024-02-25)

Summary

OpenAI’s Preparedness team is dedicated to ensuring the safety of frontier AI models. Working alongside the company’s other safety and policy teams, it mitigates misuse of current models and products while also laying the groundwork for the safety of superintelligent models in the future. The team maps emerging risks and invests in capability evaluations and data-driven predictions to detect and address them early. OpenAI emphasizes a builder’s mindset toward safety, learning from real-world deployment to refine its mitigations. The Preparedness Framework, currently in beta, describes the company’s approach to developing and deploying frontier models safely: running evaluations, defining risk thresholds, establishing dedicated teams for oversight, developing safety protocols, and collaborating with external parties to reduce both known and unknown risks. OpenAI acknowledges that AI safety is an evolving field and plans to update the framework regularly based on feedback and emerging knowledge.
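
The summary above mentions risk thresholds and post-mitigation scores; to make that gating logic concrete, here is a minimal Python sketch of how a per-category risk scorecard and the framework’s two thresholds could be expressed. The beta framework states that only models with a post-mitigation score of “medium” or below in every tracked category can be deployed, and only those scoring “high” or below can be developed further; the function names, category keys, and numeric encoding below are illustrative assumptions, not OpenAI’s implementation.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Ordered risk levels used on a per-category scorecard (illustrative encoding)."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def can_deploy(post_mitigation: dict[str, RiskLevel]) -> bool:
    # Deployment gate: every tracked category must score MEDIUM or below
    # after mitigations are applied.
    return all(score <= RiskLevel.MEDIUM for score in post_mitigation.values())


def can_continue_development(post_mitigation: dict[str, RiskLevel]) -> bool:
    # Development gate: every tracked category must score HIGH or below,
    # i.e. no category may reach CRITICAL post-mitigation.
    return all(score <= RiskLevel.HIGH for score in post_mitigation.values())


# A hypothetical post-mitigation scorecard for one evaluation run; the beta
# framework tracks categories such as cybersecurity, CBRN, persuasion, and
# model autonomy.
scorecard = {
    "cybersecurity": RiskLevel.MEDIUM,
    "cbrn": RiskLevel.LOW,
    "persuasion": RiskLevel.HIGH,
    "model_autonomy": RiskLevel.LOW,
}

print(can_deploy(scorecard))                # False: persuasion scores HIGH
print(can_continue_development(scorecard))  # True: nothing is CRITICAL
```

Encoding the levels as an IntEnum keeps the “medium or below” comparisons explicit and makes the gate decisions easy to audit.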

Signals

| Signal | Change | 10y horizon | Driving force |
| --- | --- | --- | --- |
| Dedicated Preparedness team for AI safety | Increased focus on AI safety | Robust safety measures and policies in place for advanced AI models | Mitigating risks and ensuring safe AI deployment |
| Science-based and data-driven risk assessment | Moving from hypothetical to concrete | Risk assessment based on rigorous measurements and data-driven predictions | More accurate and informed risk assessment |
| Applying builder’s mindset to safety | Iterative deployment and learning | Continuous learning and improvement of safety measures through real-world deployment | Adapting safety measures to keep pace with innovation |
| Preparedness Framework (beta) for safe deployment | Structured approach to model evaluation | Enhanced evaluation and risk assessment processes with detailed reports and risk “scorecards” | Ensuring the safety of frontier AI models |
| Developing risk thresholds for safety measures | Defining risk levels and mitigation | Clear risk thresholds and safety measures based on models’ post-mitigation scores | Ensuring safe deployment of AI models |
| Technical team and Safety Advisory Group | Technical oversight and decision-making | Dedicated teams overseeing technical work, evaluating reports, and facilitating safety decision-making | Informed and well-structured safety processes |
| Protocols for safety and external accountability | Regular safety drills and audits | Regular safety drills, audits by independent third parties, and external feedback on safety measures | Ensuring external accountability and validation |
| Collaboration to address known/unknown risks | Tracking misuse and emergent risks | Collaborative efforts to track real-world misuse, emergent misalignment risks, and unknown risks | Mitigating known and unknown safety risks |
| Living document for continuous improvement | Regular updates and feedback | Dynamic and evolving framework continuously updated based on learning and feedback | Continuous enhancement of AI safety practices |
