neptune.ai

Machine learning • Data science • MLOps • Experiment tracking • Model registry

11 - 50 employees

🤖 Artificial Intelligence

☁️ SaaS

Staff Site Reliability Engineer

November 15

🇪🇺 Anywhere in Europe – Remote

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Azure

Cloud

Distributed Systems

ElasticSearch

Google Cloud Platform

GRPC

Java

Kafka

Kotlin

Kubernetes

MySQL

Python

Redis

Rust

Scala

Spring

Terraform

WordPress

Description

• We are seeking an experienced Staff Site Reliability Engineer to join our fully remote team.
• As a key player in our Engineering team, you will contribute to infrastructure design and optimization.
• Have an impact on the scalability, resilience, and performance of Neptune solutions.
• Ownership of Site Reliability Process: Own the site reliability process and systems through all stages.
• Ensure the scalability, resilience, and performance of Neptune solutions across global SaaS and client-hosted environments.
• Design and implement automation workflows to streamline deployments, upgrades, and incident response.
• Ensure infrastructure and processes meet security and industry standards, protecting sensitive data.
• Partner with development, product, customer success, and client teams to deliver robust solutions.
• Document architecture, operational procedures, and troubleshooting guides to enable knowledge sharing.
• Participate in on-call rotations to maintain system uptime and performance.

Requirements

• 6+ years in SRE, DevOps, or related roles.
• Strong experience managing and optimizing Kubernetes clusters for robust, scalable, and efficient infrastructure.
• Proven expertise in designing and implementing automation solutions for infrastructure and application deployment, with experience in Terraform, Helm, and GitLab CI/CD.
• Strong programming skills in Shell and Python.
• Extensive experience with Linux system administration and network management.
• Expertise in managing distributed computing systems and near real-time data streaming platforms.
• Fluency in English, with solid communication skills for interacting with global customers.
• Nice to have: Experience in security best practices, compliance standards (e.g., SOC 2), and infrastructure hardening.
• Nice to have: Experience with multi-cloud architecture and cloud-native technologies.
• Nice to have: Experience in high-traffic, petabyte-scale data environments.
• Nice to have: Experience with ClickHouse and Kafka deployments.

Benefits

• Flexibility: 100% remote work with flexible working hours; offices (co-works) available in Warsaw/Wrocław/Poznań/Kraków.
• Share in our success: Participate in the Employee Stock Option Plan and be part of our growth journey.
• Time off: 20 paid service-free days per year.
• Ownership and impact: Space to take action, bring your ideas to life, and make a real impact.
