Senior GPU Kernel Performance Lead

NVIDIA

Software and Technology Jobs

Full-time | Posted: Jul 25, 2025

Job Description

We're now looking for a Senior GPU Kernel Performance Lead. Do you enjoy analyzing and reporting on GPU kernel performance? If so, consider applying for the role of Senior GPU Kernel Performance Analysis Lead! Our team delivers high-performance GPU math kernels to NVIDIA’s cuDNN, cuBLAS, and TensorRT libraries to accelerate deep learning models. The team is proud to play an integral part in enabling breakthroughs in domains such as image classification, speech recognition, natural language processing, and large language models. We’re always striving for peak performance and energy efficiency on current and future-generation GPUs.

As a kernel performance analysis lead, you will oversee all efforts pertaining to the performance of our kernels. Join the team that is building the underlying software used across the world to power the revolution in artificial intelligence! To get a sense of the code we write, check out our CUTLASS open-source project showcasing performant matrix multiply on NVIDIA’s Tensor Cores with CUDA. While there will be the opportunity for hands-on development, this position specifically is to lead a team for validating the performance of the kernels.

What you’ll be doing:

  • Specify test cases, derived from Deep Learning workloads, to provide adequate directed and use-case coverage across all kernels on both simulation and silicon targets
  • Determine performance theory through the development and use of analytical models
  • Track and report on kernel performance throughout the development lifecycle by using and expanding upon current infrastructure
  • Provide feedback to the kernel developers by identifying performance regressions and opportunities to reach the achievable peak performance

What we need to see:

  • PhD degree in Computer Science, Computer Engineering, Applied Math, or related field (or equivalent experience) with 8+ years of relevant industry experience
  • Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design
  • Experience leading or managing a team relating to the performance of CPUs, GPUs, or other DL accelerators

Ways to stand out from the crowd:

  • Experience with analytical models and cycle-accurate HW simulators
  • Knowledgeable about performance tools like Nsight or VTune
  • Programming experience beyond C++ including assembly, MLIR/LLVM, Python, and CUDA/OpenCL

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you a creative and collaborative software leader seeking new challenges? If so, we want to hear from you! Come, join our DL Architecture team and help build the real-time, cost-effective AI computing platform driving our success in this exciting and quickly growing field.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 5, and 272,000 USD - 425,500 USD for Level 6. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until July 29, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Locations

  • Santa Clara, CA, US

Salary

Estimated Salary Range (medium confidence)

21,000,000 - 35,000,000 INR / yearly

Source: ai estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • GPU kernel performance analysis (intermediate)
  • C++ programming (intermediate)
  • software design (intermediate)
  • debugging (intermediate)
  • performance analysis (intermediate)
  • analytical modeling (intermediate)
  • test case specification (intermediate)
  • performance tracking and reporting (intermediate)
  • CUDA (intermediate)
  • CUTLASS (intermediate)
  • cuDNN (intermediate)
  • cuBLAS (intermediate)
  • TensorRT (intermediate)
  • Tensor Cores (intermediate)
  • deep learning workloads (intermediate)
  • simulation and silicon targets (intermediate)
  • performance regression identification (intermediate)

Tags & Categories

United States of America
