Senior Deep Learning Inference Performance Architect

NVIDIA

Engineering Jobs

Full-time | Posted: Oct 28, 2025

Job Description

We are now looking for a Senior Deep Learning Inference Performance Architect!

NVIDIA is seeking a Senior Performance Architect, a creative engineer who loves to squeeze out every cycle of performance from deep learning software. The Inference Architecture team does groundbreaking hardware-software co-design work that focuses on accelerating AI Inference workloads. In this role, you will write performance-optimized low-level code on today's GPUs, evaluate and improve state-of-the-art performance techniques in production Large Language Model deployments, and help guide our future GPU architecture decisions. If you enjoy digging deep into GPU architecture details, are passionate about AI, and know where every cycle goes when you write highly tuned software, this role may be a great fit for you.

What you'll be doing:

  • Develop innovative GPU and system architectures to extend the state of the art in AI Inference performance and efficiency
  • Model, analyze and prototype key deep learning algorithms and applications
  • Understand and analyze the interplay of hardware and software architectures on future algorithms and applications
  • Write efficient software for AI Inference, including CUDA kernels, framework-level code, and application-level code
  • Collaborate across the company to guide the direction of AI, working with software, research and product teams

What we need to see:

  • An MS or PhD in a relevant discipline (CS, EE, Math) or equivalent experience, with 5+ years of relevant experience
  • Strong mathematical foundation in machine learning and deep learning
  • Expert programming skills in C, C++, and Python
  • Familiarity with GPU computing (CUDA or similar) and HPC (MPI, OpenMP)
  • Strong knowledge and coursework in computer architecture

Ways to stand out from the crowd:

  • Background with systems-level performance modeling, profiling, and analysis
  • Experience in characterizing and modeling system-level performance, executing comparison studies, and documenting and publishing results
  • Experience in optimizing AI Inference workloads with CUDA kernel development

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous, and love a challenge, consider joining our Inference Performance Architecture team and help us build the real-time, cost-effective computing platform driving our success in this exciting and quickly growing field.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 1, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Locations

  • Durham, NC, US

Salary

Estimated Salary Range (medium confidence)

25,000,000 - 45,000,000 INR / yearly

Source: ai estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Deep Learning (intermediate)
  • AI Inference (intermediate)
  • GPU Architecture (intermediate)
  • CUDA (intermediate)
  • C++ (intermediate)
  • Python (intermediate)
  • C (intermediate)
  • Machine Learning (intermediate)
  • HPC (intermediate)
  • MPI (intermediate)
  • OpenMP (intermediate)
  • Computer Architecture (intermediate)
  • Systems-level Performance Modeling (intermediate)
  • Profiling (intermediate)
  • Performance Analysis (intermediate)
  • CUDA Kernels (intermediate)
  • Framework Level Code (intermediate)
  • Application Level Code (intermediate)
  • Mathematical Foundation in Machine Learning (intermediate)
  • Mathematical Foundation in Deep Learning (intermediate)
  • Hardware-Software Co-Design (intermediate)

Tags & Categories

United States of America

