Lightspeed (Formerly SEOshop)

Webshop platform • E-commerce • hosting • customer support • apps

Senior Site Reliability Engineer

4 days ago

🇬🇧 United Kingdom – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability (SRE)

AWS

Azure

Backbone

Bash

Cassandra

Cloud

Google Cloud Platform

Grafana

Kubernetes

MongoDB

MySQL

NoSQL

Postgres

Prometheus

Python

Redis

SQL

Terraform

Description

• Design, build, and maintain robust infrastructure on GCP, leveraging cloud-native technologies such as GKE, Cloud SQL, BigQuery, etc.
• Develop and manage CI/CD pipelines for efficient deployment and release using various technologies (GitLab, GitHub, Helm, Terraform, etc.)
• Work closely with development teams to provide tools and practices for monitoring software health in production, defining and measuring reliability metrics (SLI, SLO), and managing error budgets
• Build platform solutions and apply software engineering principles to improve software reliability and accelerate delivery
• Support the incident management process and conduct post-mortem analysis to prevent future outages
• Mentor junior SREs and developers, offering guidance on best practices in cloud architecture, data management, and software development
• Manage infrastructure changes through infrastructure as code (IaC) using Terraform
• Participate in the on-call rotation
• Stay current with industry trends and emerging technologies, advocating for the adoption of new technologies and practices to improve product quality and team efficiency

Requirements

• Bachelor’s degree in Computer Science, Engineering, or equivalent real-world experience
• 6+ years of experience in site reliability engineering, systems administration, and/or software engineering
• Expertise in container orchestration platforms, specifically Kubernetes
• Strong understanding of both relational (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis)
• Familiarity with network protocols and IP networking, along with experience in network troubleshooting
• Proficiency in at least one programming language such as Bash, Python, Go, etc.
• Proven track record of managing large-scale infrastructure in cloud environments like Google Cloud, AWS, or Azure
• Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack)
• Strong understanding of security best practices
• Excellent problem-solving skills and the ability to work under pressure to troubleshoot and resolve complex issues
• Excellent communication skills for effective collaboration with cross-functional teams
• A keen eagerness to learn and embrace challenges

Benefits

• Work in a talented global team with strong role growth opportunities
• Flexible Working policy
• Lightspeed share scheme (we are all owners)
• Company pension program
• Private medical insurance
• Health and wellness benefit
• Mental health online platform and counseling & coaching services
• Paid leave and assistance for new parents
• Language classes & LinkedIn Learning license
• Volunteer day
