GPU-accelerated computing • artificial intelligence • deep learning • virtual reality • gaming
November 28
• Develop and implement strategies to optimize AI model inference for on-device deployment.
• Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands.
• Optimize performance-critical components using CUDA and C++.
• Collaborate with multi-functional teams to align optimization efforts with hardware capabilities and deployment needs.
• Benchmark inference performance, identify bottlenecks, and implement solutions.
• Research and apply innovative methods for inference optimization.
• Adapt models for diverse hardware platforms and operating systems with varying capabilities.
• Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.
• Recommend and implement model architecture changes to improve the accuracy-latency balance.
• MSc or PhD in Computer Science, Engineering, or a related field, or equivalent professional experience.
• Over 5 years of confirmed experience specializing in model inference and optimization.
• 10+ years of work experience in a relevant area.
• Expertise in modern machine learning frameworks, particularly PyTorch, ONNX, and TensorRT.
• Proven experience in optimizing inference for transformer and convolutional architectures.
• Strong programming proficiency in CUDA, Python, and C++.
• In-depth knowledge of optimization techniques, including quantization, pruning, distillation, and hardware-aware neural architecture search.
• Skilled in building and deploying scalable, cloud-based inference systems.
• Passionate about developing efficient, production-ready solutions with a strong focus on code quality and performance.
• Meticulous attention to detail, ensuring precision and reliability in safety-critical systems.
• Strong collaboration and communication skills for working effectively across multidisciplinary teams.
• A proactive, diligent mentality with a drive to tackle complex optimization challenges.
• Equity • Benefits
November 27
Lead engineering tasks for battery storage projects while ensuring successful execution and technical support.
November 27
11 - 50
Join MBL Technologies to support IAM projects and enhance identity management as a CyberArk Engineer.
November 27
Join Automattic as a Senior Reverse Engineer handling reverse engineering for applications.
November 27
Join Beeper as a Senior Reverse Engineer to connect messaging platforms and enable communication.
🇺🇸 United States – Remote
💵 $70k - $170k / year
💰 Funding Round on 2021-05
⏰ Full Time
🟠 Senior
🦅 H1B Visa Sponsor
November 27
As a Senior ServiceNow Engineer, design, implement and support Enova’s ServiceNow environment. Collaborate in a team setting while providing tier 2 support in a 24x7 environment.