August 27
🔄 Hybrid – London
• Oversee the strategic design, system performance, resource allocation, configuration management and operational support for both our hardware and software systems
• Solve and troubleshoot complex technical problems with a proactive approach to system optimization and issue resolution
• Ensure HPC-standard security practices and compliance
• Maintain comprehensive documentation for infrastructure, configurations and procedures
• Collaborate effectively across teams of engineers and researchers
• High-performance networking
• GPUs in large-scale distributed networks
• Large-scale distributed storage file systems and providers
• Linux kernel and OS
• Virtualization and container architecture in cloud environments (Docker, Kubernetes, OpenStack…)
• SLURM
• Software, hardware and network failure troubleshooting
• LLM training
• AI/ML frameworks
• GPU programming (CUDA)
• Passionate
• Self-directed
• Low-ego
• Team player
• Daily lunch vouchers
• Contribution to a Gympass subscription
• Monthly contribution to a mobility pass
• Full health insurance for you and your family
• Generous parental leave policy