4 days ago
AWS
Azure
Backbone
Bash
Cassandra
Cloud
Google Cloud Platform
Grafana
Kubernetes
MongoDB
MySQL
NoSQL
Postgres
Prometheus
Python
Redis
SQL
Terraform
Go
• Design, build, and maintain robust infrastructure on GCP, leveraging cloud-native technologies such as GKE, Cloud SQL, BigQuery, etc.
• Develop and manage CI/CD pipelines for efficient deployment and release using various technologies (GitLab, GitHub, Helm, Terraform, etc.)
• Work closely with development teams to provide tools and practices for monitoring software health in production, defining and measuring reliability metrics (SLI, SLO), and managing error budgets
• Build platform solutions and apply software engineering principles to improve software reliability and accelerate delivery
• Support the incident management process and conduct post-mortem analysis to prevent future outages
• Mentor junior SREs and developers, offering guidance on best practices in cloud architecture, data management, and software development
• Manage infrastructure changes through infrastructure as code (IaC) using Terraform
• Participate in the on-call rotation
• Stay current with industry trends and emerging technologies, advocating for the adoption of new technologies and practices to improve product quality and team efficiency
• Bachelor’s degree in Computer Science, Engineering, or equivalent real-world experience
• 6+ years of experience in site reliability engineering, systems administration, and/or software engineering
• Expertise in container orchestration platforms, specifically Kubernetes
• Strong understanding of both relational (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis)
• Familiarity with network protocols and IP networking, along with experience in network troubleshooting
• Proficiency in at least one programming language such as Bash, Python, Go, etc.
• Proven track record of managing large-scale infrastructure in cloud environments like Google Cloud, AWS, or Azure
• Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack)
• Strong understanding of security best practices
• Excellent problem-solving skills and the ability to work under pressure to troubleshoot and resolve complex issues
• Excellent communication skills for effective collaboration with cross-functional teams
• A keen eagerness to learn and embrace challenges
• Work in a talented global team with strong role growth opportunities
• Flexible Working policy
• Lightspeed share scheme (we are all owners)
• Company pension program
• Private medical insurance
• Health and wellness benefit
• Mental health online platform and counseling & coaching services
• Paid leave and assistance for new parents
• Language classes & LinkedIn Learning license
• Volunteer day
6 days ago
5001 - 10000
Senior DevOps Engineer at NEC Digital, managing critical infrastructure and deployments.
September 26
51 - 200
L2 Cloud Operations Engineer at Revolgy resolving client issues in cloud infrastructure.
🇬🇧 United Kingdom – Remote
💰 Private Equity Round on 2020-06
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability (SRE)
September 26
1001 - 5000
Keyrus is expanding; seeks DevOps Engineer for data platforms management.
🇬🇧 United Kingdom – Remote
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability (SRE)
🇬🇧 UK Skilled Worker Visa Sponsor