NVIDIA

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

Senior Solution Architect - HPC and AI

November 8, 2024

🇸🇬 Singapore – Remote

⏰ Full Time

🟠 Senior

💻 Solutions Engineer

Ansible

AWS

Azure

Cloud

Docker

Google Cloud Platform

Kubernetes

Microservices

Python

Terraform

Description

• Primary responsibilities will include building robust AI/HPC infrastructure for new and existing customers.
• Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, training stability, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services from inception and design through deployment, operation, and refinement.
• Your primary focus would be on understanding the AI workload and how it interacts with other parts of the system like networking, storage, deep learning frameworks, data cleaning tools, etc.
• Help maintain services once they are live by measuring and monitoring progress of AI jobs and helping engineering design solutions for more robust training at scale.
• Provide feedback to internal teams such as opening bugs, documenting workarounds, and suggesting improvements.

Requirements

• BS/MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience with Python, C++, or other software development.
• Track record of medium- to large-scale AI training and understanding of key libraries used for NLP/LLM/VLA training (NeMo Framework, DeepSpeed, etc.).
• Experience with integration and deployment of software products in production enterprise environments, and with microservices software architecture.
• You are excited to work with multiple levels and teams across organisations (Engineering, Product, Sales, and Marketing teams).
• Capable of working in a constantly evolving environment without losing focus.
• Ability to multitask in a fast-paced environment.
• Driven, with strong analytical and problem-solving skills.
• Strong time-management and organization skills for coordinating multiple initiatives, priorities, and implementations of new technology and products in very sophisticated projects.
• You are a self-starter with a growth mindset and a passion for continuous learning and sharing findings across the team.
• Technical leadership, a strong understanding of NVIDIA technologies, and success in working with customers.
• Excellent verbal and written communication and technical presentation skills in English.
