Shape the Future of AI Accelerators at AWS Neuron
Join the elite team behind AWS Neuron, the software stack powering AWS's next-generation AI accelerators Inferentia and Trainium. As a Senior Software Engineer in our Machine Learning Applications team, you'll be at the forefront of deploying and optimizing some of the world's most sophisticated AI models at unprecedented scale.
What You'll Impact:
• Pioneer distributed inference solutions for industry-leading LLMs such as GPT, Llama, and Qwen
• Optimize breakthrough language and vision generative AI models
• Collaborate directly with silicon architects and compiler teams to push the boundaries of AI acceleration
• Drive performance benchmarking and tuning that directly impacts millions of inference calls globally
Key job responsibilities
You will drive the Evolution of Distributed AI at AWS Neuron
You'll develop the bridge between ML frameworks, including PyTorch and JAX, and AI hardware. This isn't just about optimization; it's about revolutionizing how AI models run at scale.
Technical Impact You'll Drive:
• Spearhead distributed inference architecture for PyTorch and JAX using XLA
• Engineer breakthrough performance optimizations for AWS Trainium and Inferentia
• Develop ML tools to enhance LLM accuracy and efficiency
• Transform complex tensor operations into highly optimized hardware implementations
• Pioneer benchmarking methodologies that shape next-gen AI accelerator design
What Makes This Role Unique:
• Direct influence on AWS's AI infrastructure used by thousands of ML applications
• Full-stack optimization from high-level frameworks to hardware-specific primitives
• Creation of tools and frameworks that define industry standards for ML deployment
• Collaboration with both open-source ML communities and hardware architecture teams
Your Technical Arsenal Should Include:
• Deep expertise in Python and ML framework internals
• Strong understanding of distributed systems and ML optimization
• Passion for performance tuning and system architecture
A day in the life
Work/Life Balance
Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.
Mentorship & Career Growth
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.
About the team
At AWS Neuron, we're revolutionizing how the world's most sophisticated AI models run at scale through Amazon's next-generation AI accelerators. Operating at the unique intersection of ML frameworks and custom silicon, our team drives innovation from silicon architecture to production software deployment.
We pioneer distributed inference solutions for PyTorch and JAX using XLA, optimize industry-leading LLMs like GPT and Llama, and collaborate directly with silicon architects to influence the future of AI hardware. Our systems handle millions of inference calls daily, while our optimizations directly impact thousands of AWS customers running critical AI workloads.
We're focused on pushing the boundaries of large language model optimization, distributed inference architecture, and hardware-specific performance tuning. Our deep technical experts transform complex ML challenges into elegant, scalable solutions that define how AI workloads run in production.
Locations
Seattle, WA, United States
Required Qualifications
- 3+ years of computer science fundamentals (object-oriented design, data structures, algorithm design, problem solving and complexity analysis) experience
- 3+ years of programming experience using Python or C++ and PyTorch.
- Experience with AI acceleration via quantization, parallelism, model compression, batching, KV caching, vLLM serving
- Experience with accuracy debugging & tooling, performance benchmarking of AI accelerators
- Fundamentals of Machine learning and deep learning models, their architecture, training and inference lifecycles along with work experience on optimizations for improving the model execution.
Preferred Qualifications
- Bachelor's degree in computer science or equivalent
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $129,300/year in our lowest geographic market up to $223,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.