Research Engineer, Frontier Evals & Environments Careers at OpenAI - San Francisco, California | Apply Now!

OpenAI

Full-time | Posted: Feb 10, 2026

Job Description

Research Engineer, Frontier Evals & Environments at OpenAI - San Francisco

Join OpenAI's Frontier Evals & Environments team and shape the future of safe AGI development. This senior-level Research Engineer role offers you the chance to build north-star evaluation environments that drive progress toward artificial general intelligence (AGI) and artificial superintelligence (ASI). Located in San Francisco, California, this position places you at the forefront of AI safety research.

Role Overview

The Frontier Evals & Environments team at OpenAI is responsible for creating ambitious benchmarks and evaluation frameworks that measure and steer frontier AI models. Our open-sourced evaluations like GDPval, SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer have set industry standards. We've conducted frontier evaluations for groundbreaking models including GPT-4o, o1, o3, GPT-4.5, ChatGPT Agent, and GPT-5.

As a Research Engineer, you'll push the boundaries of AI capabilities measurement, owning end-to-end projects that influence training, safety, and launch decisions. This role demands exceptional engineering talent passionate about AGI safety and rapid model progress. Experience firsthand how OpenAI's models evolve and contribute to steering them responsibly.

Working in our dynamic San Francisco office, you'll collaborate with top researchers to build self-improvement loops, RL environments, and scalable evaluation systems. This is your opportunity to make history in AI safety research.

Key Responsibilities

  • Create Ambitious RL Environments: Design reinforcement learning environments that rigorously test frontier models' limits and capabilities.
  • Measure Model Capabilities: Develop comprehensive frameworks to evaluate skills, behaviors, and emergent abilities in cutting-edge AI systems.
  • Innovate Evaluation Methodologies: Pioneer automatic exploration techniques to uncover hidden model behaviors and failure modes.
  • Steer Frontier Training: Influence training decisions for OpenAI's largest model runs, gaining early access to breakthrough capabilities.
  • Build Scalable Systems: Architect processes supporting continuous, high-throughput model evaluation at scale.
  • Implement Self-Improvement Loops: Create automated systems that enhance model understanding and iterative improvement.
  • Conduct Red-Teaming: Systematically identify vulnerabilities using creative, adversarial testing approaches.
  • Analyze Evaluation Data: Apply statistical rigor to interpret results and inform critical safety decisions.
  • Cross-Functional Collaboration: Partner with research, safety, and product teams to operationalize evaluations.
  • Open-Source Contributions: Publish frameworks that advance the broader AI safety ecosystem.
  • Monitor Production Systems: Implement observability for real-world model deployments.
  • Prototype Novel Benchmarks: Rapidly iterate on evaluation environments like SWE-bench and MLE-bench.
  • Drive Empirical Research: Lead studies spanning the full spectrum of AI capabilities measurement.

Qualifications

Required:

  • Deep passion for AGI/ASI measurement and AI safety research
  • Exceptional engineering skills with ML research engineering experience
  • Strong statistical analysis and experimental design capabilities
  • Creative problem-solving with a robust red-teaming mindset
  • Hands-on experience in ML research engineering, stochastic systems, LLM applications, or AI evaluations
  • Proven ability to deliver end-to-end projects in fast-paced environments

Preferred:

  • First-hand red-teaming experience with complex systems
  • Cross-functional collaboration success
  • Excellent technical communication skills

Candidates should thrive in ambiguity, demonstrate ownership, and possess the technical depth to tackle frontier AI challenges.

Salary & Benefits

Competitive Compensation: Total compensation for this senior Research Engineer role ranges from $320,000 - $480,000 USD annually, including base salary, equity, and performance bonuses. Exact figures depend on experience and qualifications.

Comprehensive Benefits Package:

  • Premium health, dental, and vision coverage
  • 401(k) with generous company match
  • Unlimited PTO policy
  • Parental leave and family planning support
  • Relocation package for San Francisco
  • Weekly catered meals, snacks, and beverages
  • Onsite fitness center and wellness programs
  • Professional development stipend
  • Cutting-edge hardware and AI infrastructure
  • Equity in OpenAI - shape the future of AI

Why Join OpenAI?

OpenAI leads the world in developing safe AGI that benefits humanity. Our Frontier Evals team directly influences model training for GPT-4o, o1, GPT-5, and beyond. You'll work alongside brilliant minds, publish influential research, and see your evaluations shape AI's trajectory.

The San Francisco location offers unparalleled access to AI talent and the local innovation ecosystem. Experience rapid model progress firsthand and contribute to humanity's most important technical challenge. OpenAI provides resources, autonomy, and impact unmatched in industry.

Join a mission-driven culture prioritizing safety, rapid iteration, and bold ambition. Your work will define AI evaluation standards for generations.

How to Apply

Ready to push AI safety frontiers? Submit your resume, GitHub/portfolio, and a brief note explaining your fit for this Research Engineer role. Highlight relevant ML research, evaluation experience, or red-teaming projects.

OpenAI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Applications are reviewed on a rolling basis, so apply promptly to join our frontier team.

This San Francisco Research Engineer position represents a rare opportunity to work on AGI safety at the world's leading AI lab. Apply now and help steer humanity's AI future.

Locations

  • San Francisco, California, United States

Salary

Estimated Salary Range (high confidence)

336,000 - 528,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Reinforcement Learning (RL): intermediate
  • Machine Learning Research: intermediate
  • AI Model Evaluation: intermediate
  • LLM Applications: intermediate
  • Statistical Analysis: intermediate
  • Red-Teaming: intermediate
  • Stochastic Systems: intermediate
  • Observability & Monitoring: intermediate
  • Python Programming: intermediate
  • Scalable Systems Design: intermediate
  • AGI/ASI Measurement: intermediate
  • RL Environment Development: intermediate
  • Model Capabilities Testing: intermediate
  • Automated Evaluation Methodologies: intermediate
  • Cross-Functional Collaboration: intermediate
  • Frontier Model Training: intermediate
  • Self-Improvement Loops: intermediate
  • High-Performance Computing: intermediate

Required Qualifications

  • Passionate about AGI/ASI measurement and safety (experience)
  • Strong engineering skills with proven ML research experience (experience)
  • Expertise in statistical analysis and data interpretation (experience)
  • Red-teaming mindset with creative problem-solving abilities (experience)
  • Experience in ML research engineering or related technical domains (experience)
  • Proficiency in stochastic systems and probabilistic modeling (experience)
  • Hands-on experience with observability, monitoring, and debugging complex systems (experience)
  • Deep knowledge of LLM-enabled applications and AI evaluations (experience)
  • Ability to thrive in dynamic, fast-paced research environments (experience)
  • Proven track record of scoping and delivering end-to-end projects (experience)
  • Experience working with frontier AI models and large-scale training runs (experience)
  • Strong communication skills for cross-functional collaboration (preferred) (experience)

Responsibilities

  • Design and create ambitious RL environments to test frontier model limits
  • Develop comprehensive measurement frameworks for model capabilities, skills, and behaviors
  • Innovate new methodologies for automatic exploration of model behaviors
  • Contribute to steering training decisions for the largest-scale model training runs
  • Build scalable systems and processes for continuous model evaluation
  • Implement self-improvement loops to automate model understanding and optimization
  • Conduct red-teaming exercises to identify model weaknesses and failure modes
  • Analyze evaluation results to inform safety, training, and deployment decisions
  • Collaborate with research, safety, and engineering teams on frontier evaluations
  • Open-source evaluation frameworks and contribute to public AI safety benchmarks
  • Monitor and observe model performance in production-like environments
  • Prototype novel evaluation environments like GDPval, SWE-bench, and MLE-bench
  • Drive empirical research on the full spectrum of AI capabilities measurement

Benefits

  • Competitive salary with equity in a leading AI company
  • Comprehensive health, dental, and vision insurance
  • 401(k) matching and retirement planning support
  • Unlimited PTO with encouraged recharge periods
  • Generous parental leave policies
  • Relocation assistance for a move to San Francisco
  • Weekly catered meals and fully stocked kitchens
  • Onsite gym membership and wellness programs
  • Learning stipend for conferences and courses
  • Direct impact on AGI safety and development
  • Work with world-class researchers and engineers
  • Latest hardware and cutting-edge AI infrastructure
  • Flexible work hours in a fast-paced environment
  • Opportunities to publish groundbreaking research

Tags & Categories

OpenAI Research Engineer jobs, Frontier Evals careers San Francisco, AGI evaluation engineer, RL environments AI jobs, AI safety research positions, ML research engineering OpenAI, Frontier model evaluation careers, Red-teaming AI specialist, SWE-bench research engineer, GPT model evaluation jobs, San Francisco AI research jobs, ASI measurement engineer, LLM evaluation specialist, OpenAI San Francisco careers, AI capabilities testing jobs, Stochastic systems ML engineer, Self-improvement loops AI, Frontier AI training research, OpenAI eval engineer salary, Research Engineer AGI safety, Model evaluation frameworks jobs, Python ML research OpenAI, Research
