AI Model Optimization Engineer

Skills

Benchmarking, C++ (14/17/20), CUDA, Efficient Attention Techniques, LoRA/QLoRA, Mixed-Precision Inference, Model Optimization, Python, Quantization (PTQ/QAT), TensorRT

We are seeking an experienced AI Inference Engineer to optimize and deploy large-scale models, including LLMs and VLMs. The role applies techniques such as quantization and LoRA to improve model performance and to deploy models efficiently on edge devices.

Key Responsibilities
  • Optimize large-scale models using quantization techniques (PTQ/QAT) and LoRA/QLoRA.
  • Architect model conversion and compilation pipelines utilizing TensorRT for edge deployment.
  • Run parity checks, accuracy recovery, and latency benchmarks that compare PyTorch baselines with edge binaries.
  • Develop and optimize CUDA kernels and TensorRT plugins to achieve high memory-bandwidth utilization and low latency.
  • Produce production-grade concurrent C++ and Python code for real-time edge inference.
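
As a rough illustration of the parity checks and latency benchmarking mentioned above, the following is a minimal sketch using only the Python standard library. The function names `parity_report` and `median_latency_ms`, the choice of metrics, and the sample values are assumptions made for illustration; they are not part of the posting or of any specific toolchain.

```python
import math
import statistics
import time


def parity_report(reference, candidate):
    """Compare two equal-length output vectors (hypothetical helper).

    `reference` stands in for baseline outputs (e.g. from PyTorch) and
    `candidate` for outputs of the optimized build.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    # Largest element-wise deviation between the two outputs.
    max_abs_err = max(abs(r - c) for r, c in zip(reference, candidate))
    # Cosine similarity between the two output vectors.
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_r = math.sqrt(sum(r * r for r in reference))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    cosine = dot / (norm_r * norm_c) if norm_r and norm_c else 0.0
    return {"max_abs_err": max_abs_err, "cosine": cosine}


def median_latency_ms(fn, warmup=3, iters=20):
    """Median wall-clock latency of `fn` in milliseconds (hypothetical helper)."""
    # Warm-up calls are excluded from the measurement.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Made-up sample values, only to show how the helpers are called.
    reference = [0.10, -0.52, 0.98, 0.33]
    candidate = [0.11, -0.50, 0.97, 0.33]
    print(parity_report(reference, candidate))
    print(median_latency_ms(lambda: sum(x * x for x in range(1000))))
```

In practice the two callables would wrap the baseline model and the edge binary, and acceptable thresholds for error and latency would depend on the model and the target device.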

Required Skills & Qualifications
  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference (INT8, FP8, INT4, BF16/FP16).
  • Proven experience optimizing large-scale models with techniques such as KV caching and efficient attention.
  • Extensive knowledge of model conversion/compilation pipelines (TensorRT, TensorRT-LLM) and benchmarking.
  • Proficiency in low-level programming for AI accelerators, including custom CUDA kernels.
  • Production-level coding skills in C++ (14/17/20) and Python for concurrent, memory-safe applications.
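
For readers less familiar with the INT8 quantization named in the qualifications above, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a toy illustration under stated assumptions, not how TensorRT or any particular framework implements PTQ; the names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative sketch).

    Maps floats to integers in [-127, 127] using a single scale
    derived from the largest absolute value in `values`.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 for empty or all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer, then clamp to the INT8 range used here.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


if __name__ == "__main__":
    # Made-up weights, only to show the round trip.
    weights = [-1.0, 0.0, 0.5, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    print(q, scale)
    print(max(abs(w - r) for w, r in zip(weights, restored)))
```

With this scheme the round-trip error per value is bounded by about half the scale, which is one reason production pipelines often use finer-grained scales (for example per channel) or calibration data when accuracy matters.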

Job Type: Remote

Salary: Not Disclosed

Experience: Entry

Duration: Months
