
Deep Learning Solutions Architect – Inference Optimization

NVIDIA

Software and Technology Jobs

Full-time · Posted: Oct 14, 2025

Job Description

NVIDIA’s Worldwide Field Operations (WWFO) team is seeking a Solutions Architect with a deep understanding of neural network inference. As our customers adopt increasingly complex inference pipelines on state-of-the-art infrastructure, there is a growing need for experts who can guide the integration of advanced inference techniques such as speculative decoding, request-scheduler optimizations, or FP4 quantization. The ideal candidate will be proficient with tools such as TensorRT-LLM, vLLM, SGLang, or similar, and will have the strong systems knowledge needed to help customers fully use the capabilities of the new GB300 NVL72 systems (for example, working on efficient KV cache offloading, helping with inference for new architectures such as hybrid or diffusion models, or architecting pre- and post-processing pipelines).

Solutions Architects work with the most exciting computing hardware and software, driving the latest breakthroughs in artificial intelligence. We need individuals who can enable customer productivity and develop lasting relationships with our technology partners, making NVIDIA an integral part of end-user solutions. We are looking for someone passionate about artificial intelligence who can stay current in a fast-paced field and coordinate efforts between corporate marketing, industry business development, and engineering. Solutions Architects are the first line of technical expertise between NVIDIA and our customers. Your duties will range from building proof-of-concept demonstrations to driving relationships with key executives and managers to promote the adoption of NVIDIA-based AI technology. Engaging with developers, scientific researchers, data scientists, IT managers, and senior leaders is a significant part of the role.

What you will be doing:

  • Work directly with key customers to understand their technology and provide the best AI solutions.
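The speculative decoding technique named above can be illustrated with a minimal toy sketch: a cheap "draft" model proposes several tokens at once, and the expensive "target" model verifies them in a single pass, accepting the longest agreeing prefix. The two models below are hypothetical deterministic stand-ins, not real LLMs, and the greedy verification rule shown is a simplification of the sampling-based acceptance test used in production systems.

```python
def draft_next(ctx):
    # Cheap draft model: a hypothetical deterministic rule over the last token.
    return (ctx[-1] * 3 + 1) % 50

def target_next(ctx):
    # Expensive target model: agrees with the draft most of the time,
    # but diverges occasionally so rejections actually happen.
    t = (ctx[-1] * 3 + 1) % 50
    return t if t % 7 != 0 else (t + 1) % 50

def speculative_decode(prompt, n_tokens, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) Target verifies all k positions (one batched pass in a real
        #    system); accept the longest agreeing prefix, then emit the
        #    target's own token at the first disagreement.
        accepted, ctx = [], list(out)
        for t in proposal:
            expected = target_next(ctx)
            if expected == t:
                accepted.append(t)
                ctx.append(t)
            else:
                accepted.append(expected)  # target's correction
                break
        out.extend(accepted)
    return out[len(prompt) :][:n_tokens]

print(speculative_decode([1], 8))
```

With greedy verification like this, the output is token-for-token identical to decoding with the target model alone; the speedup comes from the target verifying k positions per pass instead of generating one token at a time.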
  • Perform in-depth analysis and optimization to ensure the best performance on GPU-architecture systems (in particular Grace/Arm-based systems), including support for optimizing large-scale inference pipelines.
  • Partner with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers.
  • Enable the development and growth of product features through customer feedback and proof-of-concept evaluations.

What we need to see:

  • Excellent verbal and written communication and technical presentation skills in English.
  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or another engineering field.
  • 5+ years of work or research experience with Python, C++, or other software development.
  • Work experience and knowledge of modern NLP, including a good understanding of transformer, state-space, diffusion, and MoE model architectures. This can include expertise in either training or the optimization/compression/operation of DNNs.
  • Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, or DeepSpeed) and/or deployment (e.g., TensorRT-LLM, vLLM, Triton Inference Server).
  • Enthusiasm for collaborating with various teams and departments, such as Engineering, Product, Sales, and Marketing; thrives in dynamic environments and stays focused amid constant change.
  • Self-starter with a growth mindset and a passion for continuous learning and sharing findings across the team.

Ways to stand out from the crowd:

  • Demonstrated experience running and debugging large-scale distributed deep learning training or inference.
  • Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains.
  • Applied NLP technology in production environments.
  • Proficiency with DevOps tools including Docker, Kubernetes, and Singularity.
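The model compression expertise mentioned above centers on low-bit quantization. As a rough illustration, here is a minimal sketch of block-wise symmetric 4-bit quantization: a simplified stand-in for formats like FP4 (which use a floating-point grid such as E2M1 with per-block scale factors, rather than the integer grid used here). The block size and value range are illustrative choices, not taken from any specific format.

```python
def quantize_block(xs):
    # Per-block scale maps the largest magnitude onto the int4-ish range [-7, 7];
    # "or 1.0" guards against an all-zero block.
    amax = max(abs(x) for x in xs) or 1.0
    scale = amax / 7.0
    q = [max(-7, min(7, round(x / scale))) for x in xs]
    return q, scale

def dequantize_block(q, scale):
    return [v * scale for v in q]

def quantize(weights, block_size=4):
    # Split the tensor into blocks, each with its own scale factor.
    return [quantize_block(weights[i : i + block_size])
            for i in range(0, len(weights), block_size)]

def dequantize(blocks):
    out = []
    for q, scale in blocks:
        out.extend(dequantize_block(q, scale))
    return out

w = [0.12, -0.53, 0.97, 0.05, 2.1, -1.4, 0.3, 0.0]
w_hat = dequantize(quantize(w))
print(max(abs(a - b) for a, b in zip(w, w_hat)))  # worst-case round-trip error
```

The per-block scales are why outliers in one block (like the 2.1 above) do not destroy precision everywhere else; real FP4 deployments add a floating-point value grid and calibration on top of this idea.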
  • Understanding of HPC systems: data-center design, high-speed InfiniBand interconnects, and cluster storage and scheduling, including related design and/or management experience.

Widely considered one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com/. NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

Locations

  • UK (Remote)

Salary

Estimated Salary Range (medium confidence)

75,000,000 - 150,000,000 INR / yearly

Source: AI estimate

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • neural network inference (intermediate)
  • speculative decoding (intermediate)
  • request scheduler optimizations (intermediate)
  • FP4 quantization (intermediate)
  • TRT LLM (intermediate)
  • vLLM (intermediate)
  • SGLang (intermediate)
  • systems knowledge (intermediate)
  • KV cache offloading (intermediate)
  • hybrid models (intermediate)
  • diffusion models (intermediate)
  • pre- and post-processing pipelines (intermediate)
  • artificial intelligence (intermediate)
  • proof-of-concept demonstrations (intermediate)
  • in-depth analysis (intermediate)
  • optimization (intermediate)
  • develop lasting relationships (intermediate)
  • coordinate efforts (intermediate)
  • engage with developers (intermediate)
  • engage with scientific researchers (intermediate)
  • engage with data scientists (intermediate)
  • engage with IT managers (intermediate)
  • engage with senior leaders (intermediate)

Tags & Categories

United Kingdom
