Situational Awareness

Situational awareness is a model's capability to distinguish between training, evaluation, and deployment contexts and behave differently in each; to know that it is a model; and to have knowledge about itself and its likely surroundings, such as the company that trained it, where its servers are located, who provides its feedback, and who has administrative access (Shevlane et al., 2023).

Situational Awareness Dataset

The Situational Awareness Dataset (SAD) measures AI models' self-knowledge and situational awareness through 12,000+ questions across 7 categories such as influence, introspection, and deployment stages. It tests whether models can recognize their own generated text, predict their own behavior, distinguish evaluation from deployment, and follow instructions that require self-knowledge. We use the SAD-mini subset because running the full benchmark is computationally expensive and no leaderboard reports results for recent models.

Top score: 85% (Claude Opus 4.5)

Why this benchmark?

Several benchmarks measure situational awareness, including SA-Bench, AwareBench, and Google DeepMind's (GDM) Situational Awareness Evaluation. Of these, only SAD and the GDM evaluation directly address safety concerns. The GDM evaluation contains just 11 tasks, too few to track gradual capability gains numerically. Neither SAD nor the GDM evaluation maintains a publicly updated leaderboard, which would be valuable for tracking progress in this area.

Related takeover scenarios

AI takes over using weapons of mass destruction

AI takes over using persuasion and manipulation

AIs at powerful positions take over by colluding

Over time

[Chart: model scores over time]

Complete model results

Model	Score	Release date
Claude Opus 4.5	85%	2025-11-24
Gemini 2.5 Pro	79%	2025-07-17
o1 Preview	75%	2024-09-12
o1 Mini	71%	2024-09-12
Claude 3 Opus	67%	2024-03-04
Claude 3.5 Sonnet	66%	2024-06-20
Claude 3 Sonnet	61%	2024-03-04
Llama 3 70B Chat	61%	2024-04-18
GPT-4 0613	61%	2023-06-13
GPT-4o	59%	2024-05-13
GPT-4 0125 Preview	58%	2024-01-25
Claude 2.1	57%	2023-11-21
Claude Instant 1.2	56%	2023-08-09
DeepSeek R1	56%	2025-01-20
Claude 3 Haiku	54%	2024-03-04
Llama 2 70B Chat	51%	2024-02-24
Llama 2 70B	50%	2024-02-24
Llama 2 13B Chat	49%	2024-02-24
Llama 2 7B	47%	2024-02-24
GPT-3.5 Turbo 0613	47%	2023-06-13
Llama 2 13B	42%	2024-02-24
Davinci 002	41%	2022-11-01
Llama 2 7B Chat	39%	2024-02-24

Source: https://situational-awareness-dataset.org/
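
To make the trend in the table easier to read, the sketch below picks out the models that set a new top score at the time of their release. It is only an illustration: the rows are a hand-copied subset of the table, and the function name `running_best` is hypothetical, not part of SAD or any library.

```python
from datetime import date

# A subset of (model, score in %, release date) rows copied from the table.
# Illustrative only: the full table has more rows.
ROWS = [
    ("Claude Opus 4.5", 85, date(2025, 11, 24)),
    ("Gemini 2.5 Pro", 79, date(2025, 7, 17)),
    ("Claude 3.5 Sonnet", 66, date(2024, 6, 20)),
    ("Claude 3 Opus", 67, date(2024, 3, 4)),
    ("GPT-4 0613", 61, date(2023, 6, 13)),
    ("GPT-3.5 Turbo 0613", 47, date(2023, 6, 13)),
    ("Davinci 002", 41, date(2022, 11, 1)),
]


def running_best(rows):
    """Return the rows that set a new top score, in release-date order."""
    frontier = []
    best = float("-inf")
    # Sort by date, then by descending score, so that among models released
    # on the same day the stronger one is considered first.
    for model, score, released in sorted(rows, key=lambda r: (r[2], -r[1])):
        if score > best:
            best = score
            frontier.append((released.isoformat(), model, score))
    return frontier


for released, model, score in running_best(ROWS):
    print(f"{released}  {model:<18} {score}%")
```

On this subset, Claude 3.5 Sonnet and GPT-3.5 Turbo 0613 are skipped because a model released earlier or on the same day had already scored higher. Running the same function over the full table would give the complete frontier.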