Senior AI Engineer - APM Experiences Careers at Datadog - New York, New York | Apply Now!

Datadog

Full-time
Posted: Jan 21, 2026

Job Description

Senior AI Engineer - APM Experiences at Datadog

Role Overview

Datadog's APM Experiences team is at the forefront of revolutionizing Application Performance Monitoring through cutting-edge AI-powered capabilities. We're seeking a Senior AI Engineer to lead the development of LLM-based and agentic features that empower developers to detect, resolve, and prevent performance issues faster than ever before.

In this high-impact role based in New York, New York, you'll own end-to-end development of intelligent systems that analyze distributed traces, metrics, logs, and telemetry data to deliver root-cause analysis, proactive optimizations, and automated SLO monitoring. This is a product-minded engineering position where you'll shape user experiences from problem discovery through scalable production deployment.

Work on groundbreaking features like autonomous debugging agents, intelligent performance recommendations, and auto-generated monitors for critical business flows. Leverage Datadog's world-class observability platform to build tools that software engineers everywhere will rely on daily.

Key Responsibilities at Datadog

  • Shape AI experiences for APM: Design LLM/agentic workflows that analyze traces, metrics, logs, and telemetry to generate actionable diagnoses, explanations, and guided remediation steps.
  • Own the full development loop: Rapid prototyping, success metric definition, experimentation, iteration, and productionization of AI features at massive scale.
  • Build robust agent systems: Develop specialized tools, retrieval and planning strategies, comprehensive guardrails, prompt management, evaluation frameworks, and sophisticated fallback mechanisms.
  • Integrate deeply with the Datadog platform: Connect AI capabilities to core surfaces like Trace Explorer, Service Catalog, intelligent monitors, and workflow automation within the APM UI.
  • Partner cross-functionally: Collaborate closely with Product Managers, Designers, and partner engineering teams to deliver cohesive, delightful user experiences.
  • Elevate engineering standards: Write high-performance, maintainable backend services; own production systems; enhance reliability for high-throughput, low-latency observability pipelines.
  • Drive proactive reliability: Build features that automatically recommend performance optimizations and prevent incidents before they impact customers.
  • Establish AI evaluation rigor: Create reliable golden datasets, offline/online evaluation frameworks, and automatic regression testing for production AI systems.
  • Leverage observability expertise: Apply deep knowledge of distributed tracing, service dependencies, and performance patterns to build truly intelligent APM experiences.

Qualifications & Requirements

Product-minded engineer who ships AI to production:

  • 4+ years of experience building backend systems or real-time ML infrastructure, emphasizing simplicity, correctness, and performance
  • Proven track record delivering LLM/agent features to production (prompt engineering, tooling, comprehensive evals, safety/guardrails)
  • Experience owning complete user journeys from prototype through alpha testing to general availability, backed by clear product metrics

Strong ML/applied science fundamentals:

  • Deep understanding of the complete ML lifecycle (task definition, data collection, modeling, evaluation, deployment, iteration) and statistical rigor
  • Proven ability to select optimal techniques (anomaly detection, ranking, NLP) and recognize when heuristics outperform complex models
  • Expertise building reliable offline/online evaluations for AI systems with golden datasets and automated regression testing

Distributed systems & observability expertise:

  • Hands-on experience with microservices performance: distributed tracing, latency analysis, concurrency patterns, resiliency strategies
  • Production proficiency in Go, Java, or Python; strong API/service architecture; operational maturity (monitoring, alerting, on-call)

Nice-to-haves: Experience with OpenTelemetry/Datadog APM, distributed tracing stacks, RAG systems for observability data, SLO/SLA practices.

Salary & Benefits

Datadog offers competitive compensation for Senior AI Engineers in New York, typically ranging from $185,000 - $250,000 base salary plus significant equity, variable performance bonuses, and comprehensive benefits. Actual offers consider experience, skills, and market factors.

  • Competitive base salary + performance bonuses
  • Significant equity package with long-term value creation
  • Comprehensive healthcare including family coverage, dental, vision
  • 401(k) with company match for retirement planning
  • Generous PTO and flexible vacation policy
  • Fitness reimbursements and wellness programs
  • Parental leave and family planning benefits
  • Mental health support including counseling services
  • Professional development stipend and learning opportunities
  • Commuter benefits and transportation allowances

Why Join Datadog?

Build tools for software engineers, by software engineers. Work on APM Experiences that millions of developers rely on daily. Enjoy massive influence on product direction and direct business impact. Collaborate with skilled, kind teammates who prioritize teaching and learning.

Datadog's observability platform processes trillions of data points daily, giving you unparalleled scale to build sophisticated AI systems. Join a company that's redefining cloud monitoring through intelligent automation and agentic workflows.

How to Apply

Ready to shape the future of AI-powered APM? Apply now for the Senior AI Engineer position in New York, NY. Include your resume, GitHub/portfolio, and specific examples of LLM/agent projects you've shipped to production. We're moving quickly to hire exceptional talent who can hit the ground running building production-grade AI for Datadog APM.

Datadog is an Equal Opportunity Employer committed to diversity and inclusion.

Locations

  • New York, New York, United States

Salary

Estimated Salary Range (high confidence)

194,250 - 275,000 USD / yearly

Source: ai estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Application Performance Monitoring (intermediate)
  • Distributed Tracing (intermediate)
  • LLM Agent Development (intermediate)
  • Observability Engineering (intermediate)
  • Cloud Native Monitoring (intermediate)
  • Microservices Performance (intermediate)
  • Go Programming (intermediate)
  • Python Development (intermediate)
  • Machine Learning Operations (intermediate)
  • OpenTelemetry (intermediate)
  • SLO Engineering (intermediate)
  • Incident Response (intermediate)
  • RAG Systems (intermediate)
  • Prompt Engineering (intermediate)
  • DevOps Automation (intermediate)

Required Qualifications

  • 4+ years building backend or real-time ML systems with a focus on simplicity, correctness, and performance (experience)
  • Proven experience delivering LLM/agent features to production including prompting, tooling, evals, and safety guardrails (experience)
  • Solid grasp of the ML lifecycle from task definition through deployment and iteration, plus statistics expertise (experience)
  • Experience with microservices performance including tracing, latency breakdowns, concurrency, and resiliency patterns (experience)
  • Proficiency in Go, Java, or Python with strong API/service design and production operations experience (experience)
  • Comfortable owning complete user journeys from prototype to general availability with clear product metrics (experience)

Responsibilities

  • Design and ship LLM/agentic workflows that analyze traces, metrics, and logs to generate performance diagnoses
  • Prototype quickly, define success metrics and evaluations, run experiments, and productionize AI features
  • Build robust agent systems with tools, retrieval strategies, planning, guardrails, and human-in-the-loop paths
  • Integrate AI capabilities with Datadog platforms like Trace Explorer, Service Catalog, and APM workflows
  • Collaborate with Product Managers, Designers, and engineering teams to create cohesive APM experiences
  • Write performant, maintainable backend code and own services in high-throughput production environments
  • Develop proactive performance optimization recommendations to prevent incidents before they occur
  • Create intelligent monitors and SLOs automatically for critical business flows and application paths
  • Improve reliability of low-latency data systems handling distributed tracing and observability telemetry

Benefits

  • Competitive salary and significant equity package
  • Comprehensive healthcare including family coverage
  • Dental and vision insurance plans
  • 401(k) plan with generous company match
  • Paid time off and flexible vacation policy
  • Fitness reimbursements and wellness programs
  • Parental planning and family leave benefits
  • Mental health support and counseling services
  • Professional development stipend and training
  • Commuter benefits and transportation allowances

Tags & Categories

AI Engineering, APM, Distributed Tracing, LLM Development, Observability, Cloud Native, DevOps, Machine Learning, Backend Engineering, New York Tech Jobs, Senior AI Engineer Datadog, APM Experiences careers, LLM agent development jobs, Distributed tracing engineer, Application Performance Monitoring AI, Observability AI engineer NYC, Datadog APM careers New York, Production ML engineer jobs, OpenTelemetry AI developer, SLO engineering positions, Cloud monitoring AI careers, Microservices performance engineer, RAG systems observability, Prompt engineering production jobs, Datadog New York engineering, Autonomous debugging agent developer, AI observability platform careers, Dev Eng
