At Cohere, our mission is to build machines that understand the world, and to make them safely accessible to all.
Natural Language Processing • Machine Learning • Artificial Intelligence
Yesterday
🔄 Hybrid – London
• Join the team responsible for building the infrastructure and compute platform at Cohere
• Focus on stability, observability, and scalability
• Participate in a 24x7 on-call rotation, with compensation for on-call shifts
• Remote-friendly environment across the EMEA region

• 5+ years of engineering experience running production infrastructure at large scale
• Experience designing large, highly available distributed systems with Kubernetes, and running GPU workloads on those clusters
• Experience working with GCP, Azure, AWS, and/or OCI
• Experience designing, deploying, supporting, and troubleshooting complex Linux-based computing environments
• Excellent collaboration and troubleshooting skills to build mission-critical systems and ensure smooth operations and efficient teamwork
• The grit and adaptability to solve complex technical challenges that evolve day to day
• Experience working with or supporting MLEs or data scientists
• Familiarity with troubleshooting RDMA networking