Job spec: https://bluedot.org/ai-safety-strategist/
Contact: Adam Jones ([email protected])
This document sets out some of the context for the work test for the AI safety strategist role.
You don’t need to read all the linked documents - they’re just here for extra context.
We teach AI safety, but we (and the whole field) don’t have a solid plan for making sure transformative AI goes well (see Summaries of AI safety plans). This makes it hard to answer questions like ‘Will mechanistic interpretability (mech interp) actually be useful for making AI safe?’ or ‘How relevant will the Global South be to catastrophic risk reduction?’.
We will fix this by making a real, concrete plan - both to improve our courses and to help lead the field in the right direction. The plan will meet our Criteria for an AI safety plan.
We considered a number of ways to build this plan (see How to construct a good AI safety plan). Our current approach is to build the strategy bottom up from AI risk scenarios. This involves getting a set of AI risks, and then for each risk:
Finally, we’ll look at all the interventions and pick a small, tractable set that would handle all the major risks. This set will define what needs to be done to keep transformative AI (TAI) safe.
This way, we’ll know what to teach in our courses, and we’ll have a real plan we can show to the rest of the field.
After submitting the work test start form, you’ll get an automated email with the private details of the timed work test. The task will involve writing up some notes on a part of our AI safety strategy. At the end of the work test you’ll be expected to submit these notes.
After you submit your work test, you can file a payment claim to be reimbursed for the time you spent on it. We’ll try to pay you within 2 weeks of your request, although it might take up to 1 month due to the winter holidays.