Goal: existential security (x-risk from AI is negligible, either indefinitely or for long enough for humanity to plan its future)
Defines ‘theory of victory’, which I think is similar to what we want from a strategy (although maybe less action-oriented?)
We can call [reaching existential security] an AI governance endgame. A positive framing for AI governance (that is, achieving a certain endgame) can provide greater strategic clarity and coherence than a negative framing (that is, avoiding certain outcomes).
A theory of victory for AI governance combines an endgame with a plausible and prescriptive strategy to achieve it. It should also be robust across a range of future scenarios, given uncertainty about key strategic parameters.
Three theories of victory:
Frames AI safety as a positive challenge to bring the world into a state of existential security, rather than a negative challenge of minimizing many x-risks. Claims:
What we want from an endgame:
What we want from a strategy:
What we want from a theory of victory:
What we want from interventions:
We might also want to consider other x-risks, e.g. biosafety and nuclear security. If AI can help with these, we’d want to favor those paths.
Nukes
[notes]