HPC Solutions Engineer

May 23

Apply Now
Logo of Hydra Host

Hydra Host

resilience β€’ bare metal β€’ data centers β€’ security β€’ provisioning

11 - 50

πŸ’° $10M Seed Round on 2022-04

Description

β€’ Report directly to the Co-Founder & CTO and collaborate with team members. β€’ Responsible for bespoke scoping and execution of high end compute resource orchestration. β€’ Configure and maintain large scale GPU clusters for premium clients. β€’ Install, configure, and monitor dependencies for customers. β€’ Work with teams and workloads to leverage large scale clusters. β€’ Conduct performance tuning and optimization of systems.

Requirements

β€’ 7+ years of high performance compute, distributed machine learning, GPU computing, and/or system architecture experience. β€’ Proficient in managing NVIDIA GPU environments and familiarity with GPU computing frameworks and libraries. β€’ Strong experience with high-speed networking technologies, specifically InfiniBand. β€’ Experience with HPC job schedulers, preferably SLURM. β€’ Expertise in automating environment setup and maintenance using Ansible and Terraform. β€’ Demonstrated ability in deploying and managing virtual storage solutions. β€’ Strong coding skills in Python and familiarity with machine learning libraries and frameworks. β€’ Excellent problem-solving, communication, and teamwork skills.

Benefits

β€’ Work with diverse hardware configurations and cutting-edge GPU use cases. β€’ Fully remote with a high accountability and high agency culture. β€’ Competitive salary, equity, and benefits. β€’ Flexible PTO.

Apply Now

Similar Jobs

Built byΒ Lior Neu-ner. I'd love to hear your feedback β€” Get in touch via DM or lior@remoterocketship.com