3 days ago
🇺🇸 United States – Remote
⏳ Contract/Temporary
🟡 Mid-level
🟠 Senior
🖥 Software Engineer
🗽 H1B Visa Sponsor
• Responsible for designing and implementing chaos engineering practices to enhance resilience and reliability.
• Develop and execute strategies, including chaos experiments and drills, to identify weaknesses.
• Leverage cloud platform experience to implement chaos experiments simulating failure scenarios.
• Partner with teams to integrate practices into the CI/CD pipeline, fostering a culture of reliability.
• Utilize observability tools to monitor, analyze, and understand system performance.
• Create documentation for methodologies and conduct training sessions for best practices.
• Analyze results from experiments to drive improvements in design and operational practices.
• Collaborate with incident response teams to refine management processes.
• Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
• 5+ years of experience in software engineering, systems architecture, or related fields.
• Proven experience with chaos engineering principles and practices in cloud environments.
• Familiarity with chaos engineering tools (e.g., Gremlin, Chaos Monkey, Litmus) and observability platforms.
• Strong knowledge of cloud computing architectures (AWS, Azure, GCP).
• Proficiency in programming/scripting languages (Python, Go, Java, etc.) for automation of chaos experiments.
• Experience with observability tools (e.g., Prometheus, Grafana, Datadog) to derive insights from chaos tests.
• Excellent problem-solving skills and ability to think critically under pressure.
• Strong communication skills to effectively share insights and findings with technical and non-technical stakeholders.
• Ability to work collaboratively in a fast-paced, agile environment.
• Experience with site reliability engineering (SRE) practices preferred.
• Familiarity with microservices architectures and container orchestration (e.g., Kubernetes) preferred.
• Understanding of incident response and disaster recovery planning preferred.
6 days ago
51 - 200
Join MSRcosmos as a Talend Developer utilizing data integration skills in a remote setting.
6 days ago
51 - 200
Explore Army career pathways in construction and engineering as a bridge crewmember. Opportunities available with training and benefits.
6 days ago
11 - 50
Join the SAFe Agile team as a GIS Developer for a conservation data platform. Work with geospatial datasets and develop applications for data delivery.
November 28
Seeking a Mid Level Developer with backend Go/Postgres or frontend React expertise for remote work.
November 28
Join ECP to develop mobile applications that enhance senior living community care.