Site Reliability Engineer

21 hours ago

Replicant

Artificial Intelligence • Telephony • Machine Learning • Conversational AI • TTS

51 - 200

Description

• Ensure the smooth operation and high availability of Replicant's production systems
• Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
• Develop and maintain tools and automation to prevent and quickly resolve incidents
• Collaborate with engineering teams to improve the reliability and scalability of our applications and infrastructure
• Participate in on-call rotation to address production issues and ensure service uptime
• Contribute to infrastructure design and implementation, focusing on scalability, security, and cost-effectiveness
• Stay up-to-date on industry best practices and emerging technologies in SRE and DevOps

Requirements

• Proven experience in managing and troubleshooting complex, distributed systems in a production environment
• Strong understanding of cloud platforms (GCP preferred) and containerization technologies (Kubernetes)
• Proficiency in scripting languages and automation tools (e.g., Python, Bash, Terraform)
• Experience with monitoring and observability systems (e.g., Datadog, Prometheus)
• Excellent problem-solving skills and a proactive approach to identifying and mitigating potential issues
• Strong communication and collaboration skills, with the ability to work effectively in a team environment
• A passion for ensuring the reliability and performance of critical systems
• Bonus Points: Experience with CI/CD pipelines and infrastructure-as-code practices, Knowledge of networking concepts and protocols, Familiarity with security best practices for cloud-based systems, Familiarity with telephony applications

Benefits

• Remote working environment that respects time zone differences
• Highly competitive salaries, equity, and, for US employees, a 401(k) plan
• Top-of-the-line healthcare (medical, vision, and dental)
• Health and Wellness Perk
• Equipment Stipend
• Flexible vacation policy
• Amazing team trips & offsites where you can find our CEO baking bread for the team
• Replicants are eligible for a 5-week sabbatical after being at the company for 4.5 years
