Senior Reliability Engineer

November 21

Americor

Debt Resolution • Debt Analysis • Credit Card Debt • Credit Counseling

Description

• Ensure the reliability of infrastructure supporting mission-critical services, minimizing downtime and optimizing performance.
• Proactively monitor, respond to, diagnose, and resolve incidents, improving response time and minimizing customer impact.
• Work closely with Russian-speaking developers, as well as QA and system analysts.
• Enhance CI/CD pipelines, monitoring tools, and automation processes to streamline workflows and increase system efficiency.
• Keep infrastructure-related documentation up to date.
• Infrastructure is hosted in MS Azure and AWS, but mainly in OVHcloud (US), which comprises bare-metal servers and VMs.
OS: CentOS / AlmaLinux.
Components: Nginx, KeyDB/Redis, OpenSearch.
Database: MariaDB/MySQL, Percona / Galera Cluster, ProxySQL, MaxScale.
Storage: GlusterFS.
Networking: HAProxy, VyOS, iptables.
Language: PHP 8 (PHP-FPM, Yii2, Symfony, Laravel, OPcache).
Monitoring tools: Datadog, Vector, Sentry.
IaC: Terraform, Ansible.
Alerting: OpsGenie.

Requirements

• 5+ years of experience in a Site Reliability Engineering role, with a proven track record of maintaining high-availability infrastructure in a high-load environment.
• Expertise in Linux systems and web stacks (Nginx, PHP, MySQL/MariaDB, Redis/KeyDB) to ensure smooth and efficient operation.
• Strong experience with MySQL/MariaDB Galera Cluster and GlusterFS storage to optimize data reliability and scalability.
• Deep knowledge of network architectures, including TCP/IP, DNS, VPNs, and load-balancing techniques, with hands-on experience in troubleshooting and optimizing network performance.
• Proficiency in PHP and Docker for seamless integration and deployment of services.
• Solid understanding of CI/CD and security best practices.
• Understanding of Infrastructure-as-Code, Monitoring-as-Code, and GitOps (we use Ansible and Terraform).
• Experience with Cloudflare and AWS services (EKS, S3, OpenSearch).
• Experience building fault-tolerant systems and working with compliance audits (SOC, FFIEC, etc.).
• Familiarity with Jira and Agile software development.
• Familiarity with modern container orchestration and deployment tools (Kubernetes, Helm).
• Fluent in Russian (reading, writing, and speaking).

Benefits

• Ongoing training and development
• Opportunity for career advancement
• Medical
• Dental
• Vision
• Company Paid Group Life / AD&D Insurance
• 7 Paid Holidays and 2 Floating Holiday Days to use at will
• Paid Time Off
• Flexible Spending/HSA
• Employee Assistance Program (EAP)
• 401(k) match
• Referral Program
