GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming
10,000+
November 7
• Develop and maintain continuous integration and delivery pipelines.
• Develop tooling to automate the deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
• Deploy monitoring solutions for servers, network, and storage.
• Troubleshoot from the bottom up: bare metal, operating system, software stack, and application level.
• As a technical resource, develop, redefine, and document standard methodologies to share with internal teams.
• Support Research & Development activities and engage in POCs/POVs for future improvements.
• BS/MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience in networking fundamentals, the TCP/IP stack, and data center architecture.
• 8+ years of designing, implementing, and maintaining large-scale HPC/AI clusters with monitoring, logging, and alerting.
• Experience managing Linux job/workload schedulers and orchestration tools.
• Knowledge of HPC and AI solution technologies, from CPUs and GPUs to high-speed interconnects and supporting software.
• Direct design, implementation, and management experience with cloud computing platforms (e.g. AWS, Azure, Google Cloud).
• Experience with job scheduling and orchestration technologies such as Slurm, Kubernetes, and Singularity.
• Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalld, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols such as TCP, DHCP, and DNS.
• Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS.
• Familiarity with newer and emerging storage technologies.
• Python programming and Bash scripting experience.
• Comfortable with automation and configuration management tools, including Jenkins, Ansible, and Puppet/Chef.
• Deep knowledge of networking technologies such as InfiniBand and Ethernet.
• Deep understanding of and experience with virtual systems (for example VMware, Hyper-V, KVM, or Citrix).
• Strong written, verbal, and listening skills in English are critical.
October 26
51 - 200
Onboarding Engineer to define and build out solutions for Brinqa's cyber risk software.
🇮🇳 India – Remote
💰 Private Equity Round on 2021-06
⏰ Full Time
🟡 Mid-level
🟠 Senior
💻 Solutions Engineer
October 26
51 - 200
Design and implement solutions for Brinqa’s cyber risk management software.
🇮🇳 India – Remote
💰 Private Equity Round on 2021-06
⏰ Full Time
🟡 Mid-level
🟠 Senior
💻 Solutions Engineer
October 20
2 - 10
Design IT architecture at Forhyre and manage customer engagements in Microsoft Azure.
🇮🇳 India – Remote
💵 ₹2.5M - ₹3.5M / year
💰 Seed Round on 2016-03
⏰ Full Time
🟡 Mid-level
🟠 Senior
💻 Solutions Engineer