Syndica

Solana • Infrastructure • RPC infrastructure • Blockchain

2 - 10

Site Reliability Engineer

July 23

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🗽 H1B Visa Sponsor

Ansible

AWS

Azure

Chef

Cloud

DNS

Docker

Google Cloud Platform

Grafana

JMeter

Kubernetes

Prometheus

Python

Rust

SDLC

Terraform

TypeScript

Web3

Description

• Administer overall site availability, security, latency, and system health.
• Effective provisioning, installation/configuration, operation, and maintenance of services and system software and related infrastructure.
• Develop comprehensive monitoring solutions to provide full visibility to the different system components using tools like Kubernetes, Prometheus, Grafana, ELK, Datadog, New Relic, etc.
• Enable the development team to release code quickly and reliably by ensuring full observability of systems and automated detection of performance and integration issues.
• Formulate technical performance measures and implement them using queries, logs, code instrumentation and other analytics tools.
• Design dashboards and visualizations that effectively convey technical measures.
• Troubleshoot issues at multiple layers of deployment, from hardware, to operating environment, network, and application to conduct root cause analysis and make recommendations from your findings.
• Work with development teams to ensure best practices for scalability, reliability, and security are designed and implemented from the start.
• Forecast changes in demand and capacity to establish appropriate scalability plans and drive decisions on the right-sizing of servers, storage and other resources.
• Design and perform high-throughput stress testing to determine system capacity limits and identify points of failure.
• Troubleshoot critical customer issues related to Syndica’s RPC, APIs, and App Deployments.

Requirements

• Great collaborator with 5+ years of experience in a DevOps or SRE role
• Proficiency in scripting languages (Python, Shell) and experience with at least one modern programming language (Go, Rust, TypeScript, etc.)
• Experience deploying large-scale systems reliably
• Experience using Kubernetes
• Working knowledge of web and network protocols and standards (HTTP, TLS, DNS, etc.)
• Working knowledge of information security issues
• Experience writing automation tools and eagerness to "automate all the things"
• Commitment to implementing reliability and security best practices
• Capacity planning experience, including resource optimization and load testing
• Systematic problem-solving approach, combined with a strong sense of ownership and drive
• Experience with Prometheus/Grafana for metrics aggregation/visualization and other monitoring and alerting tools
• Experience with infrastructure-as-code tools such as Terraform, Ansible, Chef
• Experience building and managing virtualized systems (KVM, OVM, containers/Docker) and ability to read and understand source code
• Knowledge of one or more load testing tools (k6, Locust, JMeter, etc.)
• Experience configuring CI/CD pipelines