Crusoe
Crusoe's mission is to align the future of energy with the future of computing. We build innovative technologies that reduce both the costs and the environmental impact of the world’s expanding digital infrastructure. By creating mutually beneficial relationships between the energy and digital sectors, we unlock stranded energy resources, lower the costs of computation, and pave the way for a more sustainable and prosperous future.
Our focus is on providing cloud computing solutions powered by otherwise wasted energy sources, reducing flaring and emissions while simultaneously supporting compute-intensive workloads such as AI and machine learning. We are a team of innovators, problem-solvers, and visionaries dedicated to making a tangible difference in the world. Join us on our journey to accelerate the abundance of energy and intelligence in a sustainable manner.
As a Staff Site Reliability Engineer (SRE) specializing in Managed AI at Crusoe, you will play a critical role in ensuring the reliability, scalability, and performance of our AI-optimized cloud platform. You will be responsible for designing, building, and operating the infrastructure that powers large language models (LLMs) and other AI services at scale. This role requires a deep understanding of distributed systems, a passion for automation, and a commitment to delivering exceptional service to our customers.
You will collaborate closely with AI, platform, and infrastructure teams to optimize the entire AI pipeline, from training to inference. Your contributions will directly impact the efficiency and effectiveness of our AI services, enabling our customers to push the boundaries of innovation in various industries. You will be a key member of a dynamic team that is at the forefront of sustainable and transformative cloud infrastructure.
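Day to day, you can expect to design and operate reliable managed AI services, build automation and reliability tooling, define and measure SLIs/SLOs, collaborate with other teams to optimize AI pipelines, and investigate and resolve reliability issues.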
San Francisco is a hub of technological innovation and a vibrant ecosystem for AI and machine learning. Located in the Bay Area, just north of Silicon Valley, the city offers unparalleled opportunities for professional growth and networking. By working in our San Francisco office, you will be surrounded by some of the brightest minds in the industry, with access to cutting-edge research and development.
Beyond the professional advantages, San Francisco boasts a rich cultural scene, diverse neighborhoods, and stunning natural beauty. From the Golden Gate Bridge to the vibrant arts and culinary scene, there’s always something new to explore. The city's commitment to sustainability also aligns perfectly with Crusoe's mission, making it an ideal place to live and work.
At Crusoe, we are committed to fostering the growth and development of our employees, and as a Staff Site Reliability Engineer you will have opportunities to advance your career.
Crusoe offers a competitive salary and a comprehensive benefits package.
Our culture is built on a foundation of innovation, collaboration, and sustainability. We value diversity, creativity, and a passion for making a difference. At Crusoe, you will be part of a team that is committed to continuous learning and open communication.
If you are passionate about AI, distributed systems, and sustainability, and you are looking for a challenging and rewarding career, we encourage you to apply for the Staff Site Reliability Engineer, Managed AI position at Crusoe. To apply, please submit your resume and a cover letter outlining your qualifications and experience through our online application portal.
What is Crusoe's mission?
Crusoe's mission is to align the future of energy with the future of computing by building innovative technologies that reduce both the costs and environmental impact of expanding digital infrastructure.
What is Site Reliability Engineering (SRE)?
Site Reliability Engineering (SRE) is a discipline that applies software engineering principles to infrastructure operations. The goal of SRE is to automate operational tasks, improve system reliability, and ensure scalability and performance.
What are Large Language Models (LLMs)?
Large language models (LLMs) are a type of artificial intelligence model that is trained on vast amounts of text data to understand and generate human-like text. They are used in a variety of applications, including chatbots, content creation, and machine translation.
What is Kubernetes?
Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It is widely used in cloud-native environments to manage complex workloads.
What kind of experience is Crusoe looking for in a Staff SRE?
Crusoe is looking for candidates with a strong software engineering background, experience in distributed systems design, hands-on experience with LLMs or AI/ML infrastructure, and a solid understanding of SRE principles.
What programming languages are preferred for this role?
Proficiency in at least one modern programming language, such as Python, Go, Java, or C++, is required.
What are the key responsibilities of a Staff SRE at Crusoe?
Key responsibilities include designing and operating reliable managed AI services, building automation and reliability tooling, defining and measuring SLIs/SLOs, collaborating with other teams to optimize AI pipelines, and investigating and resolving reliability issues.
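As a rough illustration of what defining and measuring SLIs/SLOs can involve, the sketch below computes an availability SLI and the remaining error budget from request counts. The 99.9% target, the function names, and the sample numbers are illustrative assumptions, not Crusoe's actual targets or tooling.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully (the SLI)."""
    if total_requests == 0:
        # Assumption: with no traffic, treat the service as fully available.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(good_requests: int, total_requests: int,
                           slo_target: float) -> float:
    """Fraction of the error budget still unspent.

    The value goes negative once more requests have failed than the
    SLO target allows.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures == 0:
        # No traffic or a 100% target: there is no budget to spend.
        return 0.0
    actual_failures = total_requests - good_requests
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 500 of them failed.
    sli = availability_sli(999_500, 1_000_000)
    budget = error_budget_remaining(999_500, 1_000_000, slo_target=0.999)
    print(f"SLI: {sli:.4%}, error budget remaining: {budget:.0%}")
```

In practice the request counts would come from a monitoring system rather than hard-coded numbers, and the SLO target would be agreed with the service's stakeholders.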
What benefits does Crusoe offer?
Crusoe offers a competitive salary, restricted stock units, comprehensive health insurance, HSA contributions, paid parental leave, life and disability insurance, Teladoc access, a 401(k) plan with a company match, generous paid time off, and additional perks like cell phone reimbursement and tuition reimbursement.
How does Crusoe contribute to sustainability?
Crusoe contributes to sustainability by using otherwise wasted energy sources to power cloud computing solutions, reducing flaring and emissions while supporting compute-intensive workloads such as AI and machine learning.
What is the work environment like at Crusoe?
Crusoe fosters a culture of innovation, collaboration, and sustainability. The work environment is dynamic, fast-paced, and mission-driven, with a focus on continuous learning and open communication.
Estimated salary: 198,000 to 308,000 USD per year
Source: AI-estimated
* This is an estimated range based on market data and may vary based on experience and qualifications.
© 2026 Pointers. All rights reserved.
