Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems.
June 3
🏢 In-office - London
• An opportunity to build and run elegant and thorough machine learning experiments
• Contribute to exploratory experimental research on AI safety with a focus on risks
• Collaborate with other teams including Interpretability, Fine-Tuning, and the Frontier Red Team
• Test the robustness of safety techniques and run multi-agent reinforcement learning experiments
• Build tools to evaluate the effectiveness of novel LLM-generated jailbreaks
• Contribute ideas, figures, and writing to research papers, blog posts, and talks
• Run experiments feeding into key AI safety efforts at Anthropic
• Have significant software, ML, or research engineering experience
• Have some experience contributing to empirical AI research projects
• Have some familiarity with technical AI safety research
• Prefer fast-moving collaborative projects to extensive solo efforts
• Pick up slack, even if it goes outside your job description
• Care about the impacts of AI
• Optional equity donation matching
• Comprehensive health, dental, and vision insurance for you and your dependents
• 401(k) plan with 4% matching
• 22 weeks of paid parental leave
• Unlimited PTO – most staff take between 4-6 weeks each year, sometimes more!
• Stipends for education, home office improvements, commuting, and wellness
• Fertility benefits via Carrot
• Daily lunches and snacks in our office
• Relocation support for those moving to the Bay Area