Staff Site Reliability Engineer - AWS/EKS

August 13

🇺🇸 United States – Remote

💵 $148k - $204k / year

⏰ Full Time

🔴 Lead

👨🏻‍🔧 Site Reliability Engineer (SRE)

🗽 H1B Visa Sponsor

SentinelOne

Secure your enterprise with the autonomous cybersecurity platform. Endpoint. Cloud. Identity. XDR. Now.

next-generation endpoint protection • endpoint detection & response • threat and malware prevention • exploit prevention • cybersecurity

1001 - 5000 employees

Description

• Support the stability, reliability, and scalability of SentinelOne’s distributed systems through the work of the Site Reliability Engineering organization, including managing Kubernetes, creating IaC, and leading troubleshooting during incident response
• Identify problem areas, such as performance issues and availability concerns, and perform technical and architectural reviews in partnership with fellow engineering teams to improve the overall reliability of SentinelOne systems
• Design and implement comprehensive monitoring and alerting, along with SLIs/SLOs and critical user journeys, to provide deeper insight into the performance and availability of SentinelOne’s systems
• Analyze systems, identify toil, and develop and implement strategies such as automation to streamline and optimize SRE’s support of critical systems

Requirements

• 7+ years of experience in Site Reliability Engineering, preferably with a large-scale SaaS product or large cloud-based distributed system
• 5+ years of production experience with orchestration systems like Kubernetes, Nomad, or Mesos
• Experience with a scripting language, such as Python, Golang, Java, or Ruby
• Familiarity with running Java and JavaScript applications, including build and deploy
• AWS experience, and familiarity with other platforms like GCP
• Experience using Infrastructure as Code (IaC) to set up cloud-native services
• Familiarity with CI and practical delivery using Jenkins, GHA, ArgoCD, or similar tools; familiarity with deployment strategies like blue-green, rolling, and canary deploys, and with best practices around deployment automation
• Curiosity, fast learning, and great communication skills
• Preferred: 2+ years of experience in a FedRAMP environment
• Ability to work in a diverse and distributed team
• Self-starter attitude, with passion for new technologies and empathy for legacy systems
• Ability to learn quickly and navigate unfamiliar programming languages, systems, and processes

Benefits

• Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
• Unlimited PTO
• Industry-leading gender-neutral parental leave
• Paid company holidays
• Paid sick time
• Employee stock purchase program
• Disability and life insurance
• Employee assistance program
• Gym membership reimbursement
• Cell phone reimbursement
• Numerous company-sponsored events including regular happy hours and team-building events
