Senior Infrastructure Engineer Careers at Crusoe - San Francisco, California | Apply Now!

Crusoe

Full-time | Posted: Nov 2, 2025

Job Description

Senior Infrastructure Engineer at Crusoe - San Francisco, California

Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

Role Overview

We are seeking a highly skilled and motivated Senior Infrastructure Engineer to join Crusoe’s Fleet Operations team. This role is focused on the advanced diagnosis, maintenance, and repair of high-performance GPU compute clusters, ensuring maximum uptime, reliability, and performance across our fleet.

The ideal candidate will be hands-on with GPU rack-level troubleshooting and work closely with data center operations, engineering, and vendors to support cutting-edge infrastructure featuring the latest NVIDIA and AMD GPUs. This position plays a critical role in maintaining the health and scalability of Crusoe’s rapidly growing GPU fleet.

A Day in the Life

As a Senior Infrastructure Engineer, your day will be dynamic and challenging. You'll start by reviewing the overnight performance reports of Crusoe's GPU fleet, identifying any potential issues or areas of concern. You'll then dive into troubleshooting specific hardware faults within GPU racks and high-density compute systems, utilizing your deep understanding of GPU architectures to pinpoint the root cause of problems.

You'll collaborate with data center operations to manage and perform field-replaceable unit (FRU) repairs for GPUs, power supplies, and cooling systems. After repairs, you'll conduct post-repair validation, burn-in testing, and NVIDIA NCCL testing to ensure system stability and performance. A significant part of your day will also involve implementing and executing preventative maintenance procedures to improve fleet reliability and extend hardware lifespan.

You'll also be responsible for maintaining detailed documentation of maintenance activities, failures, and resolutions in ticketing and asset management systems. Developing and updating standard operating procedures (SOPs) for troubleshooting, repair, and validation workflows will be another key aspect of your responsibilities. You'll collaborate with engineering, software, and data center operations teams to identify root causes of systemic failures and implement preventative solutions.

Why San Francisco?

San Francisco is a global hub for technology and innovation, making it the ideal location for Crusoe. The city offers a vibrant ecosystem of talent, resources, and opportunities, allowing you to collaborate with some of the brightest minds in the industry. Additionally, San Francisco provides a rich cultural experience, with world-class dining, arts, and entertainment options.

Career Path

This Senior Infrastructure Engineer role offers a clear career path within Crusoe. You can progress to roles such as a Lead Infrastructure Engineer, where you'll be responsible for overseeing a team of engineers and managing larger-scale projects. Alternatively, you can specialize in a specific area, such as GPU architecture or data center operations, and become a subject matter expert.

Salary & Benefits

Crusoe offers a competitive salary and benefits package for the Senior Infrastructure Engineer role. The estimated salary range for this position in San Francisco, California is $160,000 to $220,000 per year. This range is based on industry standards, experience, and qualifications. The specific compensation offered depends on interview performance.

In addition to a competitive salary, Crusoe provides a comprehensive benefits package that includes:

  • Comprehensive health insurance (medical, dental, vision)
  • Generous paid time off (PTO) policy
  • Paid holidays
  • 401(k) retirement plan with company match
  • Employee stock options
  • Professional development opportunities
  • Wellness programs
  • Life insurance
  • Disability insurance
  • Flexible spending accounts (FSA)
  • Employee assistance program (EAP)
  • Commuter benefits
  • Company-sponsored events and activities

Crusoe Culture

Crusoe fosters a culture of innovation, collaboration, and sustainability. We are committed to building a diverse and inclusive workplace where everyone feels valued and respected. We encourage our employees to think outside the box, challenge the status quo, and make a positive impact on the world.

How to Apply

If you are a highly skilled and motivated infrastructure engineer with a passion for GPU technology and sustainable computing, we encourage you to apply for the Senior Infrastructure Engineer position at Crusoe. Please submit your resume and cover letter through our online application portal.

Frequently Asked Questions (FAQ)

  1. What is Crusoe's mission?

    Crusoe's mission is to accelerate the abundance of energy and intelligence.

  2. What are the key responsibilities of a Senior Infrastructure Engineer at Crusoe?

    The key responsibilities include diagnosing and repairing GPU hardware, troubleshooting system issues, and collaborating with other teams to improve system performance and reliability.

  3. What qualifications are required for this role?

    The qualifications include experience in diagnosing and repairing high-density compute hardware, a deep understanding of GPU architectures, and proficiency with Linux and diagnostic tools.

  4. What is the salary range for this position?

    The estimated salary range for this position in San Francisco, California is $160,000 to $220,000 per year.

  5. What benefits does Crusoe offer?

    Crusoe offers a comprehensive benefits package that includes health insurance, PTO, a 401(k) plan, and employee stock options.

  6. What is the work environment like at Crusoe?

    Crusoe fosters a culture of innovation, collaboration, and sustainability.

  7. What opportunities are there for career advancement at Crusoe?

    There are opportunities to progress to roles such as Lead Infrastructure Engineer or to specialize in a specific area of expertise.

  8. Is relocation assistance provided for this position?

    Relocation assistance may be provided, depending on the candidate's situation.

  9. What kind of testing is conducted to ensure system stability?

    Post-repair validation, burn-in testing, torch testing, and NVIDIA NCCL testing are conducted.

  10. What high-speed interconnects should the candidate be familiar with?

    The candidate should be familiar with InfiniBand, NVLink, and RDMA over Converged Ethernet (RoCE).

Locations

  • San Francisco, California, United States

Salary

Estimated Salary Range (medium confidence)

176,000 - 242,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • GPU architecture (intermediate)
  • NVIDIA A100 (intermediate)
  • NVIDIA H200 (intermediate)
  • NVIDIA GB200 (intermediate)
  • NVIDIA B200 (intermediate)
  • AMD 350X (intermediate)
  • AMD 355X (intermediate)
  • InfiniBand (intermediate)
  • NVLink (intermediate)
  • RDMA over Converged Ethernet (RoCE) (intermediate)
  • Linux (Ubuntu, Rocky Linux, CentOS) (intermediate)
  • NVIDIA DCGM (intermediate)
  • NVIDIA field diagnostic utilities (intermediate)
  • Enterprise server hardware (intermediate)
  • Power delivery systems (intermediate)
  • Cooling systems (intermediate)
  • Troubleshooting (intermediate)
  • Hardware repair (intermediate)
  • Data center operations (intermediate)
  • System diagnostics (intermediate)
  • Preventative maintenance (intermediate)
  • Firmware upgrades (intermediate)
  • BIOS upgrades (intermediate)
  • Documentation (intermediate)
  • SOP development (intermediate)

Required Qualifications

  • Proven experience diagnosing and repairing high-density, rack-mounted compute hardware in production environments.
  • Deep understanding of GPU architectures and hands-on experience with GPU-based systems.
  • Experience supporting NVIDIA A100, H200, GB200, B200 and AMD 350X / 355X series platforms.
  • Familiarity with high-speed interconnects such as InfiniBand, NVLink, and RDMA over Converged Ethernet (RoCE).
  • Strong Linux experience (Ubuntu, Rocky Linux, CentOS) using the command line for diagnostics and testing.
  • Proficiency with GPU and system diagnostic tools such as NVIDIA DCGM and NVIDIA field diagnostic utilities.
  • Experience working with enterprise server hardware, power delivery, and cooling systems.
  • Strong analytical and problem-solving skills.
  • Excellent communication and collaboration skills.
  • Ability to work independently in a fast-paced data center or operations environment.

Responsibilities

  • Perform deep-level diagnosis and troubleshooting of hardware faults within GPU racks and high-density compute systems.
  • Troubleshoot and support GPU platforms including NVIDIA A100, H200, GB200, B200 and AMD 350X / 355X.
  • Execute component-level diagnosis and remediation for failed or degraded hardware.
  • Partner with data center operations to manage and perform field-replaceable unit (FRU) repairs for GPUs, power supplies, cooling systems, interconnects, and networking hardware.
  • Conduct post-repair validation, burn-in testing, torch testing, and NVIDIA NCCL testing to ensure system stability and performance.
  • Implement and execute preventative maintenance procedures to improve fleet reliability and extend hardware lifespan.
  • Perform firmware and BIOS upgrades across the GPU fleet.
  • Maintain detailed documentation of maintenance activities, failures, and resolutions in ticketing and asset management systems.
  • Develop and update standard operating procedures (SOPs) for troubleshooting, repair, and validation workflows.
  • Collaborate with engineering, software, and data center operations teams to identify root causes of systemic failures and implement preventative solutions.
  • Manage and prioritize workload to ensure timely completion of tasks.
  • Contribute to the continuous improvement of processes and procedures.

Benefits

  • Comprehensive health insurance (medical, dental, vision)
  • Generous paid time off (PTO) policy
  • Paid holidays
  • 401(k) retirement plan with company match
  • Employee stock options
  • Professional development opportunities
  • Wellness programs
  • Life insurance
  • Disability insurance
  • Flexible spending accounts (FSA)
  • Employee assistance program (EAP)
  • Commuter benefits
  • Company-sponsored events and activities
  • Relocation assistance (if applicable)

Tags & Categories

Infrastructure, GPU Technology, NVIDIA, AMD, Data Center Operations, Linux, Troubleshooting, Hardware Repair, San Francisco, California, Senior Engineer, AI, Artificial Intelligence, High Performance Computing, HPC, Cloud Computing, GPU Clusters, System Administration, DCGM, NCCL, InfiniBand, NVLink, RoCE, Senior Infrastructure Engineer, GPU, System diagnostics, Preventative maintenance, Firmware upgrades, BIOS upgrades, RDMA, A100, H200, GB200, B200, 350X, 355X, Crusoe Energy, AI Infrastructure, Sustainable computing, Green Tech, Cloud, Engineering
