
Senior Infrastructure Engineer at Crusoe - San Francisco, California

Crusoe

Full-time | Posted: Nov 2, 2025

Job Description

Role Overview

Crusoe is at the forefront of the AI revolution, building sustainable infrastructure to power the future of computing. We are seeking a highly skilled and motivated Senior Infrastructure Engineer to join our Fleet Operations team. In this role, you will be responsible for the advanced diagnosis, maintenance, and repair of high-performance GPU compute clusters, ensuring maximum uptime, reliability, and performance across our fleet. You will be hands-on with GPU rack-level troubleshooting and work closely with data center operations, engineering, and vendors to support cutting-edge infrastructure featuring the latest NVIDIA and AMD GPUs. This position plays a critical role in maintaining the health and scalability of Crusoe’s rapidly growing GPU fleet.

A Day in the Life of a Senior Infrastructure Engineer at Crusoe

Your day will be dynamic and challenging, involving a mix of hands-on troubleshooting, strategic planning, and collaboration with various teams. Here’s a glimpse of what you can expect:

  • Morning: Start by reviewing the overnight monitoring reports and addressing any critical alerts related to GPU performance or hardware failures. Collaborate with data center operations to prioritize and schedule maintenance tasks.
  • Mid-day: Dive into deep-level diagnosis of hardware faults within GPU racks. Use diagnostic tools like NVIDIA DCGM and field diagnostic utilities to pinpoint the root cause of issues. Execute component-level remediation for failed or degraded hardware, such as GPUs, power supplies, or cooling systems.
  • Afternoon: Partner with engineering and software teams to investigate systemic failures and implement preventative solutions. Develop and update standard operating procedures (SOPs) for troubleshooting, repair, and validation workflows. Conduct post-repair validation and burn-in testing to ensure system stability and performance.
  • Collaboration: Throughout the day, you’ll collaborate with data center technicians, hardware vendors, and software engineers to resolve complex issues and optimize the performance of our GPU fleet.

Why San Francisco?

San Francisco is a global hub for technology and innovation, offering strong opportunities for professional growth and networking. Situated just north of Silicon Valley in the San Francisco Bay Area, the city provides access to a vibrant ecosystem of startups, established tech companies, and leading research institutions. Crusoe’s presence in San Francisco allows you to be at the forefront of the AI revolution, surrounded by some of the brightest minds in the industry. The city also boasts a rich cultural scene, diverse culinary experiences, and numerous outdoor activities, making it an exciting place to live and work.

Career Path

At Crusoe, we are committed to the growth and development of our employees. The Senior Infrastructure Engineer role offers a clear career path with opportunities to advance into leadership positions within the Fleet Operations or Engineering teams. You can progress to roles such as:

  • Lead Infrastructure Engineer: Oversee a team of engineers and technicians, leading complex projects and driving continuous improvement in fleet operations.
  • Principal Infrastructure Engineer: Serve as a technical expert and strategic advisor, providing guidance on infrastructure design, optimization, and scalability.
  • Manager, Fleet Operations: Lead the entire Fleet Operations team, responsible for the overall health, reliability, and performance of Crusoe’s GPU fleet.

Salary and Benefits

Crusoe offers a competitive salary and benefits package commensurate with experience and qualifications. The estimated salary range for this position in San Francisco is $160,000 to $220,000 per year. In addition to a competitive salary, Crusoe provides a comprehensive benefits package that includes:

  • Comprehensive health, dental, and vision insurance
  • Generous paid time off and holiday policy
  • 401(k) plan with company match
  • Professional development opportunities and stipend
  • Wellness programs and resources
  • Employee assistance program (EAP)
  • Flexible work arrangements where possible
  • Company-sponsored events and team-building activities
  • Commuter benefits
  • Stocked kitchen with snacks and beverages

Crusoe Culture

At Crusoe, we foster a culture of innovation, collaboration, and sustainability. We are passionate about solving complex challenges and creating a positive impact on the world. Our team is composed of talented individuals from diverse backgrounds who are driven by a shared commitment to excellence. We value open communication, intellectual curiosity, and a willingness to learn and grow. We believe in empowering our employees to take ownership of their work and make a meaningful contribution to the company’s success.

How to Apply

If you are a highly motivated and skilled engineer with a passion for GPU technology and sustainable computing, we encourage you to apply. Please submit your resume and cover letter through our online application portal. Be sure to highlight your relevant experience and qualifications, and explain why you are interested in joining the Crusoe team.

Frequently Asked Questions

  1. What is Crusoe's mission?

    Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.

  2. What type of GPUs does Crusoe use?

    Crusoe uses a variety of high-performance GPUs from NVIDIA and AMD, including A100, H200, GB200, B200, 350X, and 355X series platforms.

  3. What are the key responsibilities of a Senior Infrastructure Engineer at Crusoe?

    Key responsibilities include diagnosing and repairing hardware faults in GPU racks, troubleshooting GPU platforms, performing component-level remediation, and collaborating with data center operations and engineering teams.

  4. What qualifications are required for this role?

    Required qualifications include experience diagnosing and repairing high-density compute hardware, a deep understanding of GPU architectures, and strong Linux experience.

  5. What is the work environment like at Crusoe?

    Crusoe offers a fast-paced, collaborative, and innovative work environment where employees are encouraged to take ownership and make a meaningful impact.

  6. What opportunities for professional development are available at Crusoe?

    Crusoe provides professional development opportunities and a stipend to support employee growth and learning.

  7. What is the salary range for this position?

    The estimated salary range for this position in San Francisco is $160,000 to $220,000 per year.

  8. What benefits does Crusoe offer?

    Crusoe offers a comprehensive benefits package that includes health, dental, and vision insurance, paid time off, a 401(k) plan, and more.

  9. How does Crusoe support sustainability?

    Crusoe is committed to sustainable computing practices, utilizing innovative technologies to reduce energy consumption and minimize environmental impact.

  10. What is the career path for a Senior Infrastructure Engineer at Crusoe?

    The career path includes opportunities to advance into leadership positions such as Lead Infrastructure Engineer, Principal Infrastructure Engineer, or Manager, Fleet Operations.

Locations

  • San Francisco, California, United States

Salary

Estimated Salary Range (medium confidence)

176,000 - 242,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • GPU architecture (intermediate)
  • NVIDIA A100 (intermediate)
  • NVIDIA H200 (intermediate)
  • NVIDIA GB200 (intermediate)
  • NVIDIA B200 (intermediate)
  • AMD 350X (intermediate)
  • AMD 355X (intermediate)
  • InfiniBand (intermediate)
  • NVLink (intermediate)
  • RDMA over Converged Ethernet (RoCE) (intermediate)
  • Linux (Ubuntu, Rocky Linux, CentOS) (intermediate)
  • NVIDIA DCGM (intermediate)
  • NVIDIA field diagnostic utilities (intermediate)
  • Enterprise server hardware (intermediate)
  • Power delivery systems (intermediate)
  • Cooling systems (intermediate)
  • Troubleshooting (intermediate)
  • Hardware repair (intermediate)
  • Data center operations (intermediate)
  • FRU repair (intermediate)

Required Qualifications

  • Proven experience diagnosing and repairing high-density, rack-mounted compute hardware in production environments.
  • Deep understanding of GPU architectures and hands-on experience with GPU-based systems.
  • Experience supporting NVIDIA A100, H200, GB200, B200 and AMD 350X / 355X series platforms.
  • Familiarity with high-speed interconnects such as InfiniBand, NVLink, and RDMA over Converged Ethernet (RoCE).
  • Strong Linux experience (Ubuntu, Rocky Linux, CentOS) using the command line for diagnostics and testing.
  • Proficiency with GPU and system diagnostic tools such as NVIDIA DCGM and NVIDIA field diagnostic utilities.
  • Experience working with enterprise server hardware, power delivery, and cooling systems.
  • Strong analytical and problem-solving skills.
  • Excellent communication and collaboration skills.
  • Ability to work independently in a fast-paced data center or operations environment.
  • Technical certification or Associate’s/Bachelor’s degree in Electrical Engineering, Computer Science, or a related field or demonstrated experience.
  • Experience working directly with hardware vendors and escalations.

Responsibilities

  • Perform deep-level diagnosis and troubleshooting of hardware faults within GPU racks and high-density compute systems.
  • Troubleshoot and support GPU platforms including NVIDIA A100, H200, GB200, B200 and AMD 350X / 355X.
  • Execute component-level diagnosis and remediation for failed or degraded hardware.
  • Partner with data center operations to manage and perform field-replaceable unit (FRU) repairs for GPUs, power supplies, cooling systems, interconnects, and networking hardware.
  • Conduct post-repair validation, burn-in testing, torch testing, and NVIDIA NCCL testing to ensure system stability and performance.
  • Implement and execute preventative maintenance procedures to improve fleet reliability and extend hardware lifespan.
  • Perform firmware and BIOS upgrades across the GPU fleet.
  • Maintain detailed documentation of maintenance activities, failures, and resolutions in ticketing and asset management systems.
  • Develop and update standard operating procedures (SOPs) for troubleshooting, repair, and validation workflows.
  • Collaborate with engineering, software, and data center operations teams to identify root causes of systemic failures and implement preventative solutions.
  • Manage and maintain inventory of spare parts and tools required for GPU fleet maintenance.
  • Participate in on-call rotation for emergency support and troubleshooting.

Benefits

  • Competitive salary and equity options.
  • Comprehensive health, dental, and vision insurance.
  • Generous paid time off and holiday policy.
  • 401(k) plan with company match.
  • Professional development opportunities and stipend.
  • Wellness programs and resources.
  • Employee assistance program (EAP).
  • Flexible work arrangements where possible.
  • Company-sponsored events and team-building activities.
  • Commuter benefits.
  • Stocked kitchen with snacks and beverages.
  • Opportunity to work on cutting-edge technology in a rapidly growing company.
  • Collaborative and inclusive work environment.
  • Meaningful impact on the future of AI and sustainable computing.
  • Relocation assistance (if applicable).
  • Life insurance.


Tags & Categories

Infrastructure, GPU, Data Center, NVIDIA, AMD, Linux, Senior Infrastructure Engineer, A100, H200, GB200, B200, 350X, 355X, San Francisco, California, Fleet Operations, Troubleshooting, Hardware Repair, DCGM, Compute Clusters, High-Performance Computing, InfiniBand, NVLink, RDMA, Sustainable Computing, AI Infrastructure, Crusoe Energy Systems, Rack-mounted servers, FRU repair, Preventative Maintenance, Green Tech, Cloud, Engineering
