AI Research Engineer – Datadog AI Research (DAIR) Careers at Datadog - New York, New York | Apply Now!

Datadog

Full-time | Posted: Jan 21, 2026

Job Description

Role Overview

Join Datadog's cutting-edge AI Research (DAIR) team as an AI Research Engineer in New York, New York. We're building the future of cloud observability and security through groundbreaking AI. Partner with world-class research scientists to transform research prototypes into production-ready systems that power Datadog's AI platform.

Building on proven successes like Bits AI for incident management, Watchdog for security observability, and Toto for advanced analytics, our team tackles high-impact challenges in observability foundation models, SRE autonomous agents, and production code repair agents. This is your chance to work on state-of-the-art AI that directly impacts millions of developers and Fortune 500 companies.

Key Responsibilities at Datadog

  • Build and operate ML datasets, training pipelines, evaluation frameworks, and internal tooling for rapid AI iteration
  • Implement foundation models for advanced forecasting, anomaly detection, and multi-modal telemetry analysis (logs, metrics, traces)
  • Orchestrate distributed training and RL with Ray, managing scheduling, scaling, and failure recovery at massive scale
  • Develop SRE autonomous agents that detect, diagnose, and resolve production incidents automatically
  • Create production code repair agents that use code, logs, and runtime data to fix performance and security issues
  • Establish automated benchmarks and regression tests for forecasting, agents, and code repair tasks
  • Profile systems for reliability, performance, and cost optimization in production environments
  • Make the entire research stack observable, reproducible, and accessible to engineering teams
  • Collaborate with scientists to create smooth paths from research prototypes to production deployment

Qualifications & Requirements

To succeed in this AI Research Engineer role at Datadog, you'll need:

  • 5+ years in ML engineering, research engineering, or similar roles building production ML systems
  • Expertise in Python, PyTorch, Ray, and distributed training/inference infrastructure
  • Experience with observability data (logs, metrics, traces) and cloud-native architectures
  • Strong systems engineering skills for building reliable ML pipelines at scale
  • Knowledge of SRE practices, incident management, and production debugging workflows
  • Bonus: Experience with LLMs, multi-modal models, or autonomous agents

Salary & Benefits

Datadog offers competitive compensation for AI Research Engineers in New York, including base salary, equity, and performance bonuses. Our comprehensive benefits package includes:

  • Competitive equity with significant long-term upside
  • Family health insurance with low deductibles
  • Commuter benefits covering NYC transit costs
  • Fitness reimbursements up to $1,200/year
  • Professional development stipend for conferences and certifications
  • Unlimited PTO and flexible work policies
  • 401(k) matching and financial planning support
  • Catered meals and wellness programs daily

Why Join Datadog?

Datadog is the leading cloud observability platform trusted by 20,000+ organizations worldwide. Our AI Research team works on high-risk, high-reward projects that ship to production and impact millions of users. You'll collaborate with top researchers while building infrastructure that powers real-world SRE automation and security observability.

Based in New York City, you'll enjoy our vibrant tech culture, competitive compensation, and unparalleled career growth opportunities in AI for DevOps.

How to Apply

Ready to shape the future of cloud observability with AI? Apply now for the AI Research Engineer position at Datadog AI Research (DAIR). Submit your resume and a brief note about your most impactful ML engineering project. We review applications on a rolling basis and move quickly for top talent.

Join us in New York, New York to build the autonomous future of SRE and observability! #DatadogCareers #AIResearchEngineer #CloudObservability

Locations

  • New York, New York, United States

Salary

Estimated Salary Range (high confidence)

194,250 - 291,500 USD per year

Source: AI estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • AI/ML Engineering (intermediate)
  • Observability Platforms (intermediate)
  • Cloud Monitoring (intermediate)
  • Distributed Training (intermediate)
  • Ray Framework (intermediate)
  • Anomaly Detection (intermediate)
  • Foundation Models (intermediate)
  • SRE Automation (intermediate)
  • DevOps Tooling (intermediate)
  • Telemetry Analysis (intermediate)
  • Multi-Modal AI (intermediate)
  • Production Code Repair (intermediate)
  • RL Agents (intermediate)
  • Kubernetes Orchestration (intermediate)
  • Python ML Pipelines (intermediate)

Required Qualifications

  • 5+ years of experience in ML engineering or research engineering roles
  • Strong proficiency in Python, PyTorch, and distributed training frameworks like Ray
  • Experience building and operating large-scale ML datasets and evaluation pipelines
  • Deep knowledge of observability concepts (logs, metrics, traces) and cloud infrastructure
  • Proven track record of scaling ML experiments with attention to reliability, performance, and cost
  • Familiarity with SRE practices and production incident management workflows

Responsibilities

  • Build and operate datasets, training, and evaluation pipelines for observability foundation models
  • Implement state-of-the-art models for forecasting, anomaly detection, and multi-modal telemetry analysis
  • Orchestrate distributed training and reinforcement learning with Ray including failure recovery
  • Develop SRE autonomous agents for incident detection, diagnosis, and automated resolution
  • Create production code repair agents leveraging code, logs, and runtime telemetry signals
  • Establish rigorous automated benchmarks and regression tests for AI research tasks
  • Profile ML systems for reliability, performance, and cost efficiency at scale
  • Make research infrastructure observable, reproducible, and developer-friendly
  • Partner with research scientists to productionize prototypes from research to deployment

Benefits

  • Competitive equity package with significant growth upside
  • Comprehensive health insurance covering family members
  • Generous commuter benefits for NYC public transportation
  • Fitness and wellness reimbursements up to $1,200 annually
  • Professional development stipend for conferences and courses
  • Unlimited PTO with encouraged recharge periods
  • 401(k) matching program
  • Daily catered meals and fully stocked kitchens
  • Parental leave with dedicated return-to-work support
  • Employee stock purchase plan at discounted rates

Tags & Categories

AI/ML, Research Engineering, Observability, Cloud Native, SRE, DevOps, New York Tech Jobs, AI Research Engineer Datadog, Datadog AI Research careers, observability foundation models, SRE autonomous agents, production code repair AI, ML engineer New York jobs, distributed training Ray, cloud observability AI, Datadog DAIR careers, anomaly detection engineer, telemetry analysis models, SRE AI automation jobs, AI DevOps engineer NYC, multi-modal ML observability, Datadog AI research engineer, production ML pipelines, Ray RL distributed training, Dev Eng

