Staff Engineer - Data Reliability Engineering

November 11

GEICO

Auto Insurance • RV Insurance • ATV Insurance • Boat Insurance • Motorcycle Insurance

10,000+ employees

Founded 1936

💸 Finance

Description

• Lead the design and implementation of large-scale, fault-tolerant, and highly available data platforms.
• Architect and develop end-to-end data pipelines that ensure the reliability, scalability, and performance of data processing systems.
• Drive best practices for data reliability, disaster recovery, monitoring, alerting, and incident management.
• Collaborate with cross-functional teams (data engineering, DevOps, SREs) to integrate, test, and improve platform reliability and performance.
• Mentor and guide engineers across the organization, promoting a culture of engineering excellence and continuous improvement.
• Leverage open-source tools and technologies to enhance platform capabilities, reduce costs, and increase flexibility.
• Implement automation strategies for system monitoring, data quality checks, failure recovery, and incident resolution.
• Optimize performance and cost efficiency across data infrastructures hosted on major cloud providers (AWS, GCP, Azure) or large-scale private data centers.
• Establish and enforce security and compliance standards for data systems.

Requirements

• Expertise in designing and managing large-scale distributed data systems.
• Strong knowledge of modern data platforms (e.g., Snowflake, Spark, Kyuubi, Datalake, Kafka, Airbyte, Trino, Flink, Azure Data Factory, NiFi) and related open-source tools.
• Hands-on experience with major cloud platforms (Azure, AWS, GCP) or large-scale private data center environments.
• Proficiency in programming and scripting (Python, Java, Scala, Go, etc.) for automation, data processing, and systems engineering.
• In-depth knowledge of CI/CD practices, containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, Ansible, Puppet, Chef).
• Strong understanding of database technologies (SQL, NoSQL) and distributed computing frameworks.
• Experience with monitoring, alerting, and troubleshooting tools (Prometheus, Grafana, Log Analytics, Datadog, etc.).
• Proven ability to mentor engineers and lead technical initiatives across teams.
• Excellent communication skills and ability to work effectively in a fast-paced, cross-functional environment.

Benefits

• Premier Medical, Dental and Vision Insurance with no waiting period**
• Paid Vacation, Sick and Parental Leave
• 401(k) Plan
• Tuition Reimbursement
• Paid Training and Licensures
