Senior System Software Engineer - Cloud Infrastructure

October 11

NVIDIA

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+

Description

• Design, prototype, implement and help operate the next generation of software to automate global cloud infrastructure for NVIDIA GPU-accelerated applications
• Actively participate in systems design, code reviews, test authoring, feature development, bug triage, automation, configuration, documentation, and bug fixes
• Benchmark, evaluate, and optimize the performance, reliability, and efficiency of network and storage subsystems and applications
• Lead and participate in PoC and development efforts for various application use cases

Requirements

• BS or MS in Computer Science or Computer Engineering (or equivalent experience)
• 8+ years of professional experience in software engineering, DevOps, and/or site reliability engineering
• Excellent problem-solving, collaboration, and interpersonal skills
• Outstanding communication and soft skills, with the ability to present to senior management in a sensible and persuasive manner
• Ability to influence and build relationships with other software teams and functional groups
• Exceptional knowledge and experience designing and writing concurrent code for large-scale and performance-optimized distributed systems
• Experience integrating network, storage, and compute technologies with virtual machine and container orchestration systems
• A security-first approach with a desire to deliver highly reliable, high-quality products
• Ability to root-cause functional and performance issues in distributed systems and drive them to closure
• Expert-level Linux systems configuration, administration, automation, debugging, and performance optimization (e.g., RHEL, CentOS, Ubuntu, Rocky Linux)
• Production experience with GitOps and DevOps workflows and tooling such as FluxCD, ArgoCD, Helm charts, Terraform, and/or Ansible
• Prior experience running Kubernetes clusters in production
• Proven skills in modern container networking and storage architecture
• Experience working in distributed teams across multiple time zones
• Proficiency with Go (Golang) and Python

Benefits

• Competitive salary package
• Equity
• Benefits
