Senior On-Device Model Inference Optimization Engineer

GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

November 28

🇺🇸 United States – Remote

💵 $220k - $339.3k / year

⏰ Full Time

🟠 Senior

🦅 H1B Visa Sponsor

Cloud

Python

PyTorch

NVIDIA

Description

• Develop and implement strategies to optimize AI model inference for on-device deployment.
• Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands.
• Optimize performance-critical components using CUDA and C++.
• Collaborate with multi-functional teams to align optimization efforts with hardware capabilities and deployment needs.
• Benchmark inference performance, identify bottlenecks, and implement solutions.
• Research and apply innovative methods for inference optimization.
• Adapt models for diverse hardware platforms and operating systems with varying capabilities.
• Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.
• Recommend and implement model architecture changes to improve the accuracy-latency balance.

Requirements

• MSc or PhD in Computer Science, Engineering, or a related field, or equivalent professional experience.
• Over 5 years of confirmed experience specializing in model inference and optimization.
• 10+ years of work experience in a relevant area.
• Expertise in modern machine learning frameworks, particularly PyTorch, ONNX, and TensorRT.
• Proven experience in optimizing inference for transformer and convolutional architectures.
• Strong programming proficiency in CUDA, Python, and C++.
• In-depth knowledge of optimization techniques, including quantization, pruning, distillation, and hardware-aware neural architecture search.
• Skilled in building and deploying scalable, cloud-based inference systems.
• Passionate about developing efficient, production-ready solutions with a strong focus on code quality and performance.
• Meticulous attention to detail, ensuring precision and reliability in safety-critical systems.
• Strong collaboration and communication skills for working effectively across multidisciplinary teams.
• A proactive, diligent mentality with a drive to tackle complex optimization challenges.

Benefits

• Equity
• Benefits
