Principal Site Reliability Engineer

October 20


Global InfoTek, Inc.

Full Spectrum Cyber • Rapid Development • Agile Software Development • Advanced Technology R&D • Enhancement and Innovation

51 - 200

Description

• The Site Reliability Engineer (SRE) must be able to build and maintain infrastructure as code on large scale multi-site deployments.
• The SRE will utilize their experience to evaluate and assess new ways to scale platform capabilities.
• The SRE must be able to automate workflows to help push the limit of the infrastructure and enable continuous delivery of capabilities onto a hybrid infrastructure.
• The engineer will troubleshoot issues until root causes are understood on high traffic production systems, participate in design and code review processes, interact with product owners to coordinate infrastructure changes and be responsible for identifying bottlenecks and improving performance of the platform.

Requirements

• Bachelor's degree in Computer Science, Mathematics, or equivalent technical degree; or equivalent industry experience.
• Three-plus (3+) years of experience developing production software in modern languages such as Java, Python, Go, NodeJS, etc.
• One-plus (1+) years of experience developing containerized services deployed in production on orchestration platforms such as Kubernetes, Mesos, Swarm, etc.
• Three-plus (3+) years of experience with agile and lean software development philosophies.
• One-plus (1+) years of experience working with relational and/or non-relational databases, e.g. PostgreSQL, MySQL, MongoDB, Elasticsearch, etc.
• Two-plus (2+) years of demonstrated experience with modern version control systems such as Git, Subversion, Mercurial, etc.
• Clearable to a SECRET or above security clearance
• CompTIA Security+ CE or other DoD 8570 IAT II certification, within 90 days
• Five-plus (5+) years of experience building and maintaining Kubernetes clusters across hybrid-cloud infrastructure
• Eight-plus (8+) years of experience working in Operations, DevOps, or Site Reliability Engineering
• Five-plus (5+) years of configuration/package management experience using tools like Terraform, Helm, etc.
• Five-plus (5+) years of experience with cloud service monitoring tools like Prometheus, Grafana, FluentD, ElasticStack, SumoLogic, etc.
• Exceptionally proficient (knowledge and work experience) in Linux system administration
• Ability to assist with GitLab CI pipelines (build/promote artifacts and security scans)
• Experience creating automation using APIs from Azure or Google Cloud
