Staff Engineer - Data Reliability Engineering

5 days ago

GEICO

Auto Insurance • RV Insurance • ATV Insurance • Boat Insurance • Motorcycle Insurance

10,000+

Description

• Lead the design and implementation of large-scale, fault-tolerant, and highly available data platforms.
• Architect and develop end-to-end data pipelines that ensure the reliability, scalability, and performance of data processing systems.
• Drive best practices for data reliability, disaster recovery, monitoring, alerting, and incident management.
• Collaborate with cross-functional teams (data engineering, DevOps, SREs) to integrate, test, and improve platform reliability and performance.
• Mentor and guide engineers across the organization, promoting a culture of engineering excellence and continuous improvement.
• Leverage open-source tools and technologies to enhance platform capabilities, reduce costs, and increase flexibility.
• Implement automation strategies for system monitoring, data quality checks, failure recovery, and incident resolution.
• Optimize performance and cost efficiency across data infrastructures hosted on major cloud providers (AWS, GCP, Azure) or large-scale private data centers.
• Establish and enforce security and compliance standards for data systems.

Requirements

• Expertise in designing and managing large-scale distributed data systems.
• Strong knowledge of modern data platforms (e.g., Snowflake, Spark, Kyuubi, Datalake, Kafka, Airbyte, Trino, Flink, Azure Data Factory, NiFi) and related open-source tools.
• Hands-on experience with major cloud platforms (Azure, AWS, GCP) or large-scale private data center environments.
• Proficiency in programming and scripting (Python, Java, Scala, Go, etc.) for automation, data processing, and systems engineering.
• In-depth knowledge of CI/CD practices, containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, Ansible, Puppet, Chef).
• Strong understanding of database technologies (SQL, NoSQL) and distributed computing frameworks.
• Experience with monitoring, alerting, and troubleshooting tools (Prometheus, Grafana, Log Analytics, Datadog, etc.).
• Proven ability to mentor engineers and lead technical initiatives across teams.
• Excellent communication skills and ability to work effectively in a fast-paced, cross-functional environment.

Benefits

• Premier Medical, Dental and Vision Insurance with no waiting period
• Paid Vacation, Sick and Parental Leave
• 401(k) Plan
• Tuition Reimbursement
• Paid Training and Licensures
