Staff Software Engineer - SRE, Backend (Reliability Engineering)

Affirm

Full-time
Posted: Jan 10, 2026

Job Description

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest.

 

Site Reliability Engineering at Affirm is a small, yet crucial, team that helps our Engineering partners to “Operate What They Own” with excellence to protect their customers’ experience. SRE accomplishes this through defining frameworks and best practices for operating applications, building tooling, and providing training and consulting. Some of the many SRE responsibilities are:

  • Providing data and visibility to teams and leadership on application performance

  • Guiding the development of SLOs

  • Driving the Incident Management and Analysis process

  • Steering the implementation of Change Management and Deployment practices

  • Engaging in service and architectural conversations

  • Recommending observability and alerting configurations

 

The SRE team benefits from experience across many domains including:

  • infrastructure, platform, and distributed systems

  • capacity management, load and chaos testing

  • automation, observability, and configuration management

  • development and product experience

 

The SRE team is seeking seasoned and motivated software and systems engineers with the experience to build, iterate on, and expand incident lifecycle, reliability, and resilience practices throughout Affirm's Engineering organization and beyond.

 

What You'll Do

  • You will be responsible for setting the technical strategy and vision for your team on a multi-year time scale, and for helping your team tie it to critical, business-impacting projects.

  • You will collaborate across teams in the product development lifecycle, working with infrastructure, product management, developer experience & analytics to ensure technical sustainability, risks and trade-offs are well understood and managed.

  • You will act as a force-multiplier for your team through your definition and advocacy of technical solutions and operational processes.

  • You will take ownership of your team’s operations and availability by ensuring you have the right monitoring, triage rotations, playbooks, policies, testing and alerting in place to support “keep the lights on” & on-call efforts.

  • You will foster a culture of quality and ownership on your team by setting code review and design standards, and advocating for them beyond your team through your writing and tech talks.

  • You will help develop talent on your team by providing feedback and guidance, and leading by example.

 

What We Look For

  • You have 8+ years of experience designing, developing, and launching backend systems at scale using scripting and development languages like Bash, Python or Kotlin, and serving as a subject-matter point of reference for those systems.

  • You have an extensive track record of developing highly available distributed systems using technologies like AWS, MySQL, Spark and Kubernetes.

  • You have a track record of managing, driving and improving the Incident Lifecycle process, from live incident management through retrospective and post-incident analysis, to provide actionable insights that enhance overall system reliability, resilience, and performance.

  • You have 7+ years of experience in Site Reliability or Production Engineering teams.

  • You demonstrate curiosity with empathy, and strong opinions loosely held.

  • You have experience delivering major features, system components or deprecating existing functionality in a system through the definition of a technical and execution plan. You write high-quality code that is easily understood and used by others.

  • You thrive in ambiguity, and are comfortable moving from low-level language idioms all the way to the architecture of large systems to understand how they work.

  • Your growth and impact trajectory demonstrates that you have mastered gathering and iterating on feedback from your engineering and cross-functional peers.

  • You have strong verbal and written communication skills that support effective collaboration with our global engineering team and key stakeholders of an organization.

  • This position requires either equivalent practical experience or a Bachelor’s degree in a related field. 

 

Base Pay Grade - P

Equity Grade - 13

Employees new to Affirm typically come in at the start of the pay range. Affirm focuses on providing a simple and transparent pay structure which is based on a variety of factors, including location, experience and job-related skills.

Base pay is part of a total compensation package that may include equity rewards, monthly stipends for health, wellness and tech spending, and benefits (including 100% subsidized medical coverage, dental and vision for you and your dependents).

USA base pay range (CA, WA, NY, NJ, CT) per year: $225,000 - $275,000

USA base pay range (all other U.S. states) per year: $200,000 - $250,000

Location: Remote - US

#LI-Remote

 

Affirm is proud to be a remote-first company! The majority of our roles are remote and you can work almost anywhere within the country of employment. Affirmers in proximal roles have the flexibility to work remotely, but will occasionally be required to work out of their assigned Affirm office. A limited number of roles remain office-based due to the nature of their job responsibilities.

We’re extremely proud to offer competitive benefits that are anchored to our core value of “people come first.” Some key highlights of our benefits package include:

  • Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents 
  • Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
  • Time off - competitive vacation and holiday schedules allowing you to take time off to rest and recharge
  • ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount

We believe It’s On Us to provide an inclusive interview experience for all, including people with disabilities. We are happy to provide reasonable accommodations to candidates in need of individualized support during the hiring process.

[For U.S. positions that could be performed in Los Angeles or San Francisco] Pursuant to the San Francisco Fair Chance Ordinance and Los Angeles Fair Chance Initiative for Hiring Ordinance, Affirm will consider for employment qualified applicants with arrest and conviction records.

By clicking "Submit Application," you acknowledge that you have read Affirm's Global Candidate Privacy Notice and hereby freely and unambiguously give informed consent to the collection, processing, use, and storage of your personal information as described therein.

Locations

  • Remote US (Remote)

Salary

200,000 - 275,000 USD per year

Skills Required

  • Bash, Python or Kotlin (intermediate)
  • AWS, MySQL, Spark, Kubernetes (intermediate)
  • Infrastructure, platform, and distributed systems (intermediate)
  • Capacity management, load and chaos testing (intermediate)
  • Automation, observability, and configuration management (intermediate)
  • Development and product experience (intermediate)
  • Incident management and analysis (intermediate)
  • Change management and deployment practices (intermediate)
  • SLO development (intermediate)
  • Monitoring, triage rotations, playbooks, policies, testing and alerting (intermediate)
  • Code review and design standards (intermediate)
  • Strong verbal and written communication (intermediate)

Required Qualifications

  • 8+ years of experience designing, developing, and launching backend systems at scale using scripting and development languages like Bash, Python or Kotlin, and serving as a subject-matter point of reference for those systems (experience)
  • Extensive track record of developing highly available distributed systems using technologies like AWS, MySQL, Spark and Kubernetes (experience)
  • Track record of managing, driving and improving the Incident Lifecycle process from live incident management through retrospective and post-incident analysis to provide actionable insights to enhance overall system reliability, resilience, and performance (experience)
  • 7+ years of experience in Site Reliability or Production Engineering teams (experience)
  • Demonstrate curiosity with empathy, and strong opinions loosely held (experience)
  • Experience delivering major features, system components or deprecating existing functionality in a system through the definition of a technical and execution plan. Write high-quality code that is easily understood and used by others (experience)
  • Thrive in ambiguity, and are comfortable moving from low-level language idioms all the way to the architecture of large systems to understand how they work (experience)
  • Growth and impact trajectory demonstrates that you have mastered gathering and iterating on feedback from your engineering and cross-functional peers (experience)
  • Strong verbal and written communication skills that support effective collaboration with our global engineering team and key stakeholders of an organization (experience)
  • Equivalent practical experience or a Bachelor’s degree in a related field (experience)

Responsibilities

  • Set the technical strategy and vision for your team on a multi-year time scale, and help your team tie it to critical, business-impacting projects
  • Collaborate across teams in the product development lifecycle, working with infrastructure, product management, developer experience & analytics to ensure technical sustainability, risks and trade-offs are well understood and managed
  • Act as a force-multiplier for your team through your definition and advocacy of technical solutions and operational processes
  • Take ownership of your team’s operations and availability by ensuring you have the right monitoring, triage rotations, playbooks, policies, testing and alerting in place to support “keep the lights on” & on-call efforts
  • Foster a culture of quality and ownership on your team by setting code review and design standards, and advocating for them beyond your team through your writing and tech talks
  • Help develop talent on your team by providing feedback and guidance, and leading by example

Benefits

  • Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents
  • Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
  • Time off - competitive vacation and holiday schedules allowing you to take time off to rest and recharge
  • ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount

Tags & Categories

Infrastructure Platform Eng
