Senior Site Reliability Engineer

5 days ago

🇺🇸 United States – Remote

💵 $148k - $276k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

NVIDIA

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+ employees

Description

• Design, implement and support operational and reliability aspects of large-scale Kubernetes clusters, with a focus on performance at scale, real-time monitoring, logging and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement.
• Support services before they go live through activities such as system design consulting, developing software tools, platforms and frameworks, capacity management and launch reviews.
• Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
• Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
• Practice sustainable incident response and blameless postmortems.
• Be part of an on-call rotation to support production systems.

Requirements

• BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent experience.
• 5+ years of experience.
• Experience with infrastructure automation and distributed systems design, and with designing and developing tools for running large-scale private or public cloud systems in production.
• Experience in one or more of the following: Python, Go, Perl or Ruby.
• In-depth knowledge of Linux, networking and containers.

Benefits

• Eligible for equity and benefits.

