NVIDIA

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+

Senior Deep Learning Systems Software Engineer - AI Infrastructure

September 15

🇺🇸 United States – Remote

💵 $180k - $339.3k / year

⏰ Full Time

🟠 Senior

🗣️ LLM Engineer

🗽 H1B Visa Sponsor

Benchmarks

Cloud

Distributed Systems

Node.js

Python

PyTorch

Description

• Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms.
• Build tools to automate workload analysis, workload optimization, and other critical workflows.
• Collaborate with cross-functional teams to analyze and optimize cloud application performance on diverse GPU architectures.
• Identify bottlenecks and inefficiencies in application code and propose optimizations to enhance GPU utilization.
• Drive end-to-end platform optimization from the hardware level to the application and service levels.
• Design and implement performance benchmarks and testing methodologies to evaluate application performance.
• Provide guidance and recommendations on optimizing cloud-native applications for speed, scalability, and resource efficiency.
• Share knowledge and best practices with domain expert teams as they transition applications to distributed environments.

Requirements

• Master's degree in CS, EE, or CSEE, or equivalent experience.
• 8+ years of experience in application performance engineering.
• Experience using large-scale, multi-node GPU infrastructure on-premises or with cloud service providers (CSPs).
• Background in deep learning model architectures and experience with PyTorch and large-scale distributed training.
• Experience with application profiling tools such as NVIDIA Nsight and Intel VTune.
• Deep understanding of computer architecture and familiarity with the fundamentals of GPU architecture.
• Experience with NVIDIA's infrastructure and software stacks.
• Proven experience analyzing, modeling, and tuning DL application performance.
• Proficiency in Python and C/C++ for analyzing and optimizing application code.