Staff Site Reliability Engineer

October 15

🇺🇸 United States – Remote

💵 $120k - $135k / year

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)


Vultr

Cloud computing • IT • Data center

51 - 200

Description

• Collaborate with cross-functional teams to craft and implement a modern observability stack and refine our incident-handling processes.
• Design and contribute to state-of-the-art cloud provider solutions for high-performance computing, AI training, and inference workloads, focusing on Observability and MLOps.
• The platform team aims to enhance the resilience and stability of our systems through thoughtful software improvements, architecture, and automation.
• Contribute to solutions for various challenges ranging in nature from low-level hardware issues to high-level distributed application scale challenges and everything in between.
• Champion DevOps and SRE principles through automation, thought leadership, and close collaboration within our engineering team.
• Enhance customer experience by improving case handling: strive for proactive responses, rich insights, and automated resolutions.
• Develop robust documentation to streamline the handling of recurring reliability issues, paving the way for junior SREs to take the helm confidently.
• Identify and implement scalable solutions to address technical challenges within our stack, setting new benchmarks for innovation.

Requirements

• 3+ years of experience in a hands-on SRE role delivering distributed architectures.
• 2+ years working with and maintaining Kubernetes clusters for highly available and regulated environments.
• 2+ years of hands-on experience with a modern Grafana stack, including Mimir, Loki, and Tempo.
• Comfortable working with complex CI/CD pipelines (GitLab/Jenkins), configuration management (Puppet/Salt), and IaC solutions such as Terraform.
• Experience working with observability pipelines or OpenTelemetry is a plus.
• A background in performance optimization for web stacks, including components such as PHP-FPM, Nginx, and MySQL.
• Strong programming skills in Python, Golang, or PHP, and an eagerness to pick up new technologies.

Benefits

• A 100% remote work environment + a company-wide virtual get-together
• 401(k) plan that matches 100% up to 4% with immediate vesting
• Professional Development Reimbursement of $2,500 each year
• 11 Holidays + Paid Time Off Accrual + Rollover Plan + take off your birthday!
• Commitment matters to Vultr! Increased PTO at 3-year anniversary + 1-month sabbatical at 5-year anniversary + Anniversary Bonus each year
• $500 first-year remote office setup + $400 each year following for new equipment
• Monthly internet reimbursement up to $75
• $50 per month for a gym membership


