Are you interested in working on cutting-edge cloud security products? Would you like to be part of one of the world’s most advanced cyber-security solutions and protect millions of computers from thousands of active attack attempts every month? Look no further than the Microsoft Defender engineering team. We are looking for a Senior Site Reliability Engineering (SRE) Manager. You will build and deliver cloud solutions at a scale that few companies in the industry are required to support. Leveraging state-of-the-art technologies, you will be instrumental in delivering holistic protection within government environments. The Microsoft Defender team is responsible for delivering a constantly evolving set of services and solutions to counter an ever-evolving attacker landscape. This team provides on-call operational support and improves the operational posture of the Microsoft Defender products within US Government clouds. You will operate our production services and work closely with other engineering teams to ensure services and systems are highly stable, meet performance SLAs, and meet the expectations of internal and external customers and users.
Locations
Multiple Locations, United States
Reston, Virginia, United States
Redmond, Washington, United States
Salary
Salary not disclosed
Required Qualifications
Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
The successful candidate must have an active U.S. Government Top-Secret Security Clearance. The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to obtain or maintain the appropriate clearance and/or meet customer screening requirements may result in employment action up to and including termination.
Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.
Doctorate Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 6+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 8+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
3+ years technical experience working with large-scale cloud or distributed systems.
1+ year(s) people management experience.
Responsibilities
Lead Reliability Strategy: Drive the vision and execution of reliability, performance, and security across critical systems and services. Influence product design and engineering decisions to ensure resilient, scalable infrastructure.
Build and Scale Automation: Champion intelligent automation (AI/ML-powered) for monitoring, deployment, and incident response to reduce manual overhead and accelerate safe delivery.
Drive Operational Excellence: Use telemetry and service-level data to guide improvements in availability, efficiency, and cost. Lead post-incident reviews and service improvement plans that restore customer trust and drive long-term resilience.
Foster Engineering Partnerships: Collaborate deeply with product engineering and security teams from early development through production to align on reliability goals and prevent recurrence of issues.
Grow and Empower Teams: Attract, mentor, and develop high-performing SRE talent. Create a culture of inclusion, learning, and accountability that supports career growth and innovation.
Shape Technical Direction: Guide architecture and tooling decisions across distributed systems and cloud infrastructure. Promote adoption of best practices and scalable solutions across teams.