Senior Solution Architect - HPC and AI

Yesterday

NVIDIA

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

Description

• Build robust AI/HPC infrastructure for new and existing customers.
• Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, training stability, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Understand the AI workload and how it interacts with other parts of the system, such as networking, storage, deep learning frameworks, and data cleaning tools.
• Maintain services once they are live by measuring and monitoring the progress of AI jobs, and help engineering design solutions for more robust training at scale.
• Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.

Requirements

• BS/MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience with Python, C++, or other software development.
• Track record of medium to large scale AI training and understanding of key libraries used for NLP/LLM/VLA training (NeMo Framework, DeepSpeed, etc.).
• Experience with integration and deployment of software products in production enterprise environments, and with microservices software architecture.
• You are excited to work with multiple levels and teams across organisations (Engineering, Product, Sales, and Marketing teams).
• Capable of working in a constantly evolving environment without losing focus.
• Ability to multitask in a fast-paced environment.
• Driven, with strong analytical and problem-solving skills.
• Strong time-management and organization skills for coordinating multiple initiatives, priorities, and implementations of new technology and products into very sophisticated projects.
• You are a self-starter with a disposition for growth, a passion for continuous learning, and a habit of sharing findings across the team.
• Technical leadership, a strong understanding of NVIDIA technologies, and success in working with customers.
• Excellent verbal and written communication and technical presentation skills in English.
