
Senior Software Engineer, Data Acquisition Careers at OpenAI - San Francisco, California | Apply Now!

OpenAI


Full-time | Posted: Feb 10, 2026

Job Description

Senior Software Engineer, Data Acquisition at OpenAI - San Francisco, CA

Join OpenAI's Data Acquisition team and shape the future of AI training data infrastructure. As a Senior Software Engineer, you'll lead projects handling petabytes of web-scale data while working on the cutting edge of distributed systems technology.

Role Overview

The Data Acquisition team within OpenAI's Foundations organization powers all aspects of data collection for our groundbreaking model training operations. Managing web crawling, GPTBot services, and massive data pipelines, this team ensures our AI systems have access to the world's knowledge at unprecedented scale.

As a Senior Software Engineer on this team, you'll own complex engineering projects spanning web crawling, data ingestion, search infrastructure, and petabyte-scale distributed systems. You'll collaborate across Data Processing, Architecture, and Scaling teams while navigating compliance challenges with our legal partners.

This role demands deep expertise in building systems that process unimaginable volumes of data with reliability and efficiency. From architecting search algorithms to deploying Kubernetes-based infrastructure, you'll tackle problems that define the frontier of AI data engineering.

San Francisco-based candidates preferred, with opportunities for exceptional remote talent. OpenAI offers competitive compensation packages including equity in one of the world's most promising AI companies.

Key Responsibilities

Your day-to-day will include leading end-to-end engineering projects in data acquisition. This encompasses designing web crawlers that respect robots.txt while maximizing coverage, building ingestion pipelines that handle petabytes without data loss, and creating search systems that power our training infrastructure.

  • Own data acquisition projects from conception through production deployment
  • Design and implement scalable web crawling infrastructure serving AI model training
  • Architect distributed data ingestion systems processing petabytes daily
  • Build advanced search and indexing algorithms optimized for massive datasets
  • Develop backend services using key-value stores with complex synchronization requirements
  • Deploy infrastructure using Kubernetes and Infrastructure-as-Code methodologies
  • Collaborate cross-functionally with Data Processing, Architecture, and Scaling teams
  • Partner with Legal on compliance, robots.txt implementation, and data privacy
  • Conduct large-scale experiments analyzing system performance at web scale
  • Perform production system monitoring, alerting, and optimization
  • Mentor engineers on distributed systems best practices and data engineering
  • Contribute to strategic roadmap planning for next-generation data infrastructure

Expect to work on systems impacting billions of web pages and handling data volumes that would overwhelm traditional infrastructure.

Qualifications

We're seeking proven senior engineers with deep distributed systems experience:

  • BS/MS/PhD in Computer Science, Electrical Engineering, or equivalent
  • 6+ years of industry experience building production software systems
  • Strong track record with large stateful distributed systems
  • Experience with web crawlers, data pipelines, or search infrastructure (strongly preferred)
  • Deep Kubernetes expertise including operators, custom resources, and IaC
  • Production experience with key-value databases (RocksDB, Cassandra, etc.)
  • Backend systems development in Python, Go, C++, or similar
  • Demonstrated ability to architect complex, reliable distributed systems
  • Experience with large-scale experimentation and performance analysis
  • Excellent cross-functional collaboration and communication skills

Candidates with experience at web-scale companies (Google, Meta, etc.) or large-scale data platforms will be prioritized.

Salary & Benefits

Compensation Range: $280,000 - $420,000 base salary + equity + comprehensive benefits. Total compensation includes significant equity upside in OpenAI.

Exceptional Benefits Package:

  • Top-tier medical, dental, and vision coverage
  • 401(k) with generous company match
  • Unlimited PTO, with time off to recharge encouraged
  • 16+ weeks parental leave
  • Mental health support programs
  • Fitness stipend and wellness reimbursement
  • Professional development budget
  • Stock options with substantial growth potential
  • Relocation assistance for SF moves
  • Catered meals, gym access, and more

Why Join OpenAI?

OpenAI isn't just building AI: we're creating artificial general intelligence that benefits all of humanity. Your work on the Data Acquisition team directly enables our most advanced models, from GPT-4 to future systems that will transform every industry.

Join a team of the world's top engineers solving problems at unprecedented scale. Work with petabytes of data, deploy systems serving billions, and collaborate with researchers pushing AI frontiers. OpenAI offers:

  • Impact: Your systems power models used by millions worldwide
  • Growth: Rapid career advancement in a high-impact environment
  • Culture: Mission-driven team valuing diverse perspectives
  • Resources: Best-in-class tools, hardware, and compute
  • Location: Vibrant San Francisco tech ecosystem

We're committed to equal opportunity and building AI safety from the ground up.

How to Apply

Ready to build the data infrastructure powering humanity's AI future? Submit your resume and a brief note about your most impactful distributed systems project. We're excited to review strong candidates immediately.


OpenAI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Locations

  • San Francisco, California, United States

Salary

Estimated Salary Range (high confidence)

294,000 - 462,000 USD per year

Source: AI estimate

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Distributed Systems (intermediate)
  • Web Crawling (intermediate)
  • Kubernetes (intermediate)
  • Infrastructure as Code (intermediate)
  • Data Ingestion (intermediate)
  • Scalable Systems (intermediate)
  • Data Processing (intermediate)
  • Search Algorithms (intermediate)
  • Key-Value Databases (intermediate)
  • Backend Services (intermediate)
  • Python (intermediate)
  • Go (intermediate)
  • Data Indexing (intermediate)
  • System Performance Analysis (intermediate)
  • Compliance and Privacy (intermediate)
  • Petabyte-Scale Data (intermediate)
  • Microservices Architecture (intermediate)
  • Cloud Infrastructure (intermediate)
  • DevOps (intermediate)
  • Experimentation and A/B Testing (intermediate)

Required Qualifications

  • BS/MS/PhD in Computer Science or related field
  • 6+ years of industry experience in software development
  • Strong expertise in large stateful distributed systems
  • Experience with large web crawlers (highly preferred)
  • Proficiency in Kubernetes and Infrastructure-as-Code
  • Deep knowledge of data processing pipelines
  • Experience building scalable backend services
  • Familiarity with key-value databases and synchronization
  • Ability to architect data indexing and search algorithms
  • Strong skills in system performance analysis and experimentation
  • Excellent communication skills, written and verbal
  • Proven ability to handle multiple tasks and adapt to priorities
  • Enthusiasm for new technologies and approaches

Responsibilities

  • Own and lead engineering projects in data acquisition including web crawling and data ingestion
  • Collaborate with Data Processing, Architecture, and Scaling teams for seamless data flow
  • Work with legal team to ensure compliance and data privacy standards
  • Develop and deploy highly scalable distributed systems handling petabytes of data
  • Architect and implement advanced algorithms for data indexing and search
  • Build and maintain robust backend services for data storage and synchronization
  • Deploy solutions in Kubernetes using Infrastructure-as-Code practices
  • Perform routine system checks and monitoring in production environments
  • Conduct experiments on large datasets to analyze system performance
  • Optimize web crawling infrastructure for efficiency and scale
  • Design fault-tolerant systems for mission-critical data operations
  • Contribute to strategic planning for data acquisition roadmap
  • Mentor junior engineers on distributed systems best practices

Benefits

  • Comprehensive medical, dental, and vision insurance
  • 401(k) retirement plan with company matching
  • Unlimited PTO with encouraged recharge periods
  • Generous parental leave policy (16+ weeks)
  • Mental health support through dedicated programs
  • Fitness stipend and wellness reimbursement
  • Learning and development budget for conferences/courses
  • Stock options in a high-growth AI company
  • Commuter benefits and relocation assistance
  • Catered meals and fully stocked kitchens
  • Onsite gym and wellness facilities
  • Volunteer time off and charitable matching
  • Cutting-edge hardware and equipment
  • Flexible work arrangements when appropriate



