Staff Software Engineer - ML Observability Careers at Datadog - Boston, Massachusetts | Apply Now!

Datadog

Full-time | Posted: Jan 21, 2026

Job Description

Role Overview

Datadog's ML Observability team is at the forefront of building cutting-edge tools to monitor, explain, and improve AI systems in production. Specializing in Large Language Models (LLMs) and generative AI, we deliver robust, scalable observability solutions including drift detection, model evaluation, and behavior tracing. As a Staff Software Engineer - ML Observability in Boston, Massachusetts, you'll lead development of new features and foundational capabilities within Datadog's LLM Observability product.

Your deep expertise in AI systems and software engineering will shape product direction, drive experimentation, and solve complex problems in the fast-evolving AI landscape. This role directly impacts how customers monitor, troubleshoot, and optimize LLM-based applications at scale. Join us to build the foundational tools making AI systems observable, understandable, and reliable in real-world production environments.

Key Responsibilities at Datadog

  • Drive design and implementation of LLM observability features for production AI workloads
  • Ideate, prototype, and scale innovative product features providing actionable insights for generative AI systems
  • Collaborate cross-functionally with engineering, product, UX, and applied science teams to achieve product-market fit
  • Develop advanced tools for tracing, evaluating, and debugging Large Language Models in production
  • Influence critical architecture decisions and mentor engineers to build resilient, high-performance systems
  • Stay deeply connected to customer pain points to prioritize the engineering roadmap
  • Monitor industry trends in machine learning and observability to drive team innovation
  • Lead experimentation initiatives including A/B testing for ML observability capabilities
  • Build scalable solutions for cloud-native AI applications using modern DevOps practices

Qualifications & Requirements

  • Education: BS/MS/PhD in Computer Science, Engineering, or related scientific field (or equivalent practical experience)
  • Systems Expertise: Deep understanding of distributed systems and scalable backend architectures
  • AI Experience: Hands-on experience building and shipping LLM-powered or GenAI applications
  • Senior Leadership: 5+ years of software engineering with a proven track record of leading complex, high-impact projects
  • Observability Knowledge: Strong experience with monitoring, tracing, and observability systems
  • Technical Proficiency: Expertise in Python, Go, or similar backend languages; familiarity with Kubernetes, Docker, AWS

Salary & Benefits

Datadog offers competitive compensation for Staff Software Engineers in ML Observability, with base salary typically ranging from $220,000 to $320,000, plus equity, bonuses, and comprehensive benefits. Our total compensation packages are designed to attract top talent in Boston's competitive tech market.

  • Competitive Equity: Meaningful stock options package
  • Health Insurance: Comprehensive coverage for family members
  • Commuter Benefits: Full transportation reimbursements
  • Fitness Reimbursements: Gym memberships and wellness programs
  • Professional Development: Learning stipends, conference attendance, certifications
  • Unlimited PTO: Flexible time off policy
  • Hybrid Workplace: Balance office collaboration with remote flexibility
  • 401(k) Matching: Generous retirement savings contributions

Why Join Datadog?

Datadog is the leading cloud observability platform trusted by thousands of innovative companies worldwide. Our ML Observability team is pioneering the future of AI monitoring, giving you the opportunity to work on production-scale LLM systems that power enterprise generative AI applications. Boston's vibrant tech ecosystem combined with Datadog's hybrid culture creates the perfect environment for technical excellence and career growth.

Work with cutting-edge technologies, collaborate with world-class engineers, and directly impact customer success. At Datadog, you'll gain exposure to the entire product lifecycle from ideation to production deployment, while staying ahead of AI industry trends.

How to Apply

Ready to shape the future of ML observability at Datadog? Apply now for the Staff Software Engineer position in Boston, Massachusetts. Submit your resume and tell us why you're passionate about building observability for the AI era. Our team reviews applications continuously - don't miss your chance to join a market leader in cloud monitoring and security.

Datadog is an Equal Opportunity Employer committed to diversity and inclusion.

Locations

  • Boston, Massachusetts, United States

Salary

Estimated Salary Range (high confidence)

231,000 - 352,000 USD per year

Source: ai estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • LLM Observability (intermediate)
  • Machine Learning Engineering (intermediate)
  • Distributed Systems (intermediate)
  • Cloud Monitoring (intermediate)
  • Generative AI (intermediate)
  • Drift Detection (intermediate)
  • Model Evaluation (intermediate)
  • Behavior Tracing (intermediate)
  • DevOps (intermediate)
  • Scalable Backend Architecture (intermediate)
  • Kubernetes (intermediate)
  • Docker (intermediate)
  • Python (intermediate)
  • Go (intermediate)
  • AWS (intermediate)
  • Observability Tools (intermediate)

Required Qualifications

  • BS/MS/PhD in Computer Science, Engineering, or related scientific field, or equivalent experience
  • Deep understanding of distributed systems and scalable backend architectures
  • Hands-on experience building and shipping LLM-powered or GenAI applications
  • 5+ years of software engineering experience with a proven track record of leading complex projects
  • Strong experience with observability, monitoring, and tracing systems
  • Proficiency in Python, Go, or similar languages for backend development

Responsibilities

  • Drive design and implementation of LLM observability features for production AI systems
  • Ideate, prototype, and scale new product features to provide insights for generative AI systems
  • Work cross-functionally with engineering, product, UX, and applied science teams
  • Develop and extend tools for tracing, evaluating, and debugging Large Language Models
  • Influence architecture decisions and mentor engineers on resilient systems
  • Stay close to customer pain points to guide product and engineering priorities
  • Stay current with machine learning trends and drive team innovation
  • Build scalable observability solutions for cloud-native AI workloads
  • Lead experimentation and A/B testing for ML observability features

Benefits

  • Competitive equity package with stock options
  • Comprehensive health insurance covering family members
  • Commuter benefits and transportation reimbursements
  • Fitness and wellness program reimbursements
  • Professional development budget and conference attendance
  • Unlimited PTO and flexible work hours
  • Hybrid workplace with work-life harmony
  • 401(k) matching and retirement planning
  • Parental leave and family support programs
  • Learning stipend for certifications and courses

Tags & Categories

ML Observability, LLM Engineering, Generative AI, Staff Engineer, Datadog Careers, Boston Tech Jobs, Cloud Observability, AI Monitoring, Staff Software Engineer ML Observability Datadog, LLM observability jobs Boston, Datadog ML engineer careers, Generative AI monitoring engineer, Large Language Model observability, AI drift detection jobs, Cloud AI observability platform, Staff engineer generative AI Boston, Datadog LLM tracing engineer, ML production monitoring jobs, Distributed systems AI engineer, Datadog Boston software engineer, AI model evaluation engineer, Observability for LLMs careers, Staff engineer cloud monitoring AI, Datadog hybrid ML jobs Massachusetts, Dev Eng
