Site Reliability Engineer

September 17


TRG Research and Development

First Responders • Cybersecurity • Big Data • Cyber Intelligence • Research and Development

Description

• Collaborate with Customer Support and DevOps teams to establish SLAs, SLOs, and SLIs
• Maintain 24/7 production stability year-round
• Deploy, configure, and monitor production environments
• Automate production deployments, validations, and reporting processes
• Develop and maintain tools for production operations
• Manage and document incidents
• Develop disaster recovery automation
• Handle MTTR and MTTD metrics
• Implement strategies to ensure 100% application uptime
• Work with development and QA teams to enhance code quality

Requirements

• At least 2 years of experience in a similar role (DevOps, SRE, System Engineer)
• Experience with IaC practices (Terraform)
• Experience with Docker and Kubernetes
• Experience with one of the major cloud providers (AWS, Azure)
• Linux administration skills
• Proven work experience with Python
• Excellent problem-solving and communication skills
• Willingness to understand the business logic and impacts of components

Benefits

• Working from home
• Flexible hours
• Yearly performance bonus
• Paid medical insurance
• Daily lunch allowance
• Sport/gym (exercise) allowance
• Udemy unlimited subscription
• Onboarding plan and training
• Equipment support
• No dress code
• Gifts and rewards for celebrations
• Happy hours and online team building
• Fresh fruit, snacks, coffee, and tea
