DL Performance Software Engineer - LLM Inference

NVIDIA

Full-time. Posted: Oct 24, 2025

Job Description

At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development to create groundbreaking technologies that enable anyone to harness the power of AI and benefit from its potential. Our team consists of experts in AI, systems, and performance optimization. Our leadership includes world-renowned experts in AI systems who have received multiple academic and industry research awards.

As a member of the LLM inference team, you will help build innovative software with the goal of making LLM inference more efficient, scalable, and accessible. Are you interested in architecting and implementing the best inference stacks in the LLM world? Work and collaborate with a diverse set of teams involved in resource orchestration, distributed systems, inference engine optimization, and writing high-performance GPU kernels. Come join our team and contribute towards pioneering accelerated computing and AI.

What you’ll be doing:

  • Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.
  • Perform benchmarking, profiling, and system-level programming for GPU applications.
  • Provide code reviews, design docs, and tutorials to facilitate collaboration among the team.
  • Conduct unit tests and performance tests for different stages of the inference pipeline.

What we need to see:

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent experience.
  • Strong coding skills in Python and C/C++.
  • 2+ years of industry experience in software engineering or equivalent research experience.
  • Knowledgeable and passionate about machine learning and performance engineering.
  • Proven project experience in building software where performance is one of its core offerings.

Ways to stand out from the crowd:

  • Solid fundamentals in machine learning, deep learning, operating systems, computer architecture, and parallel programming.
  • Research experience in systems or machine learning.
  • Project experience in modern DL software such as PyTorch, CUDA, vLLM, SGLang, and TensorRT-LLM.
  • Experience with performance modelling, profiling, debugging, and code optimization, or architectural knowledge of CPU and GPU.

We strongly encourage you to include sample projects (e.g. GitHub) that demonstrate the qualifications above.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. We are building many of the most important AI technologies and infrastructure around the world. Are you passionate about AI systems, efficiency, and performance? Join us to push the frontier of accelerated computing together!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until October 28, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Locations

  • Santa Clara, CA, US

Salary

Estimated Salary Range (medium confidence)

18,000,000 - 36,000,000 INR / yearly

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • C++ (intermediate)
  • Python (intermediate)
  • AI (intermediate)
  • Machine Learning (intermediate)
  • Deep Learning (intermediate)
  • Systems Optimization (intermediate)
  • Performance Optimization (intermediate)
  • Resource Orchestration (intermediate)
  • Distributed Systems (intermediate)
  • Inference Engine Optimization (intermediate)
  • GPU Kernels (intermediate)
  • Accelerated Computing (intermediate)
  • Benchmarking (intermediate)
  • Profiling (intermediate)
  • System-Level Programming (intermediate)
  • GPU Applications (intermediate)
  • Code Reviews (intermediate)
  • Design Documentation (intermediate)
  • Tutorials (intermediate)
  • Unit Testing (intermediate)
  • Performance Testing (intermediate)
  • Inference Pipeline (intermediate)
  • Software Engineering (intermediate)
  • Operating Systems (intermediate)
  • Computer Architecture (intermediate)
  • Parallel Programming (intermediate)

Tags & Categories

United States of America

