Senior Software Engineer, Data Acquisition Careers at OpenAI - San Francisco, California | Apply Now!

OpenAI

Full-time | Posted: Feb 10, 2026

Job Description

Senior Software Engineer, Data Acquisition at OpenAI - San Francisco, CA

Join OpenAI's Data Acquisition team and shape the future of AI training data infrastructure. As a Senior Software Engineer, you'll lead projects handling petabytes of web-scale data while working on the cutting edge of distributed systems technology.

Role Overview

The Data Acquisition team within OpenAI's Foundations organization powers all aspects of data collection for our groundbreaking model training operations. Managing web crawling, GPTBot services, and massive data pipelines, this team ensures our AI systems have access to the world's knowledge at unprecedented scale.

As a Senior Software Engineer on this team, you'll own complex engineering projects spanning web crawling, data ingestion, search infrastructure, and petabyte-scale distributed systems. You'll collaborate across Data Processing, Architecture, and Scaling teams while navigating compliance challenges with our legal partners.

This role demands deep expertise in building systems that process unimaginable volumes of data with reliability and efficiency. From architecting search algorithms to deploying Kubernetes-based infrastructure, you'll tackle problems that define the frontier of AI data engineering.

San Francisco-based candidates preferred, with opportunities for exceptional remote talent. OpenAI offers competitive compensation packages including equity in one of the world's most promising AI companies.

Key Responsibilities

Your day-to-day will include leading end-to-end engineering projects in data acquisition. This encompasses designing web crawlers that respect robots.txt while maximizing coverage, building ingestion pipelines that handle petabytes without data loss, and creating search systems that power our training infrastructure.

Own data acquisition projects from conception through production deployment
Design and implement scalable web crawling infrastructure serving AI model training
Architect distributed data ingestion systems processing petabytes daily
Build advanced search and indexing algorithms optimized for massive datasets
Develop backend services using key-value stores with complex synchronization requirements
Deploy infrastructure using Kubernetes and Infrastructure-as-Code methodologies
Collaborate cross-functionally with Data Processing, Architecture, and Scaling teams
Partner with Legal on compliance, robots.txt implementation, and data privacy
Conduct large-scale experiments analyzing system performance at web scale
Perform production system monitoring, alerting, and optimization
Mentor engineers on distributed systems best practices and data engineering
Contribute to strategic roadmap planning for next-generation data infrastructure

Expect to work on systems impacting billions of web pages and handling data volumes that would overwhelm traditional infrastructure.

Qualifications

We're seeking proven senior engineers with deep distributed systems experience:

BS/MS/PhD in Computer Science, Electrical Engineering, or equivalent
6+ years of industry experience building production software systems
Strong track record with large stateful distributed systems
Experience with web crawlers, data pipelines, or search infrastructure (strongly preferred)
Deep Kubernetes expertise including operators, custom resources, and IaC
Production experience with key-value databases (RocksDB, Cassandra, etc.)
Backend systems development in Python, Go, C++, or similar
Demonstrated ability to architect complex, reliable distributed systems
Experience with large-scale experimentation and performance analysis
Excellent cross-functional collaboration and communication skills

Candidates with experience at web-scale companies (Google, Meta, etc.) or large-scale data platforms will be prioritized.

Salary & Benefits

Compensation Range: $280,000 - $420,000 base salary + equity + comprehensive benefits. Total compensation includes significant equity upside in OpenAI.

Exceptional Benefits Package:

Top-tier medical, dental, vision coverage
401(k) with generous company match
Unlimited PTO with recharge encouragement
16+ weeks parental leave
Mental health support programs
Fitness stipend and wellness reimbursement
Professional development budget
Stock options with substantial growth potential
Relocation assistance for SF moves
Catered meals, gym access, and more

Why Join OpenAI?

OpenAI isn't just building AI - we're working toward artificial general intelligence that benefits all of humanity. Your work on the Data Acquisition team directly enables our most advanced models, from GPT-4 to future systems that will transform every industry.

Join a team of the world's top engineers solving problems at unprecedented scale. Work with petabytes of data, deploy systems serving billions, and collaborate with researchers pushing AI frontiers. OpenAI offers:

Impact: Your systems power models used by millions worldwide
Growth: Rapid career advancement in a high-impact environment
Culture: Mission-driven team valuing diverse perspectives
Resources: Best-in-class tools, hardware, and compute
Location: Vibrant San Francisco tech ecosystem

We're committed to equal opportunity and building AI safety from the ground up.

How to Apply

Ready to build the data infrastructure powering humanity's AI future? Submit your resume and a brief note about your most impactful distributed systems project. We're excited to review strong candidates immediately.

OpenAI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Locations

San Francisco, California, United States

Salary

Estimated Salary Range (high confidence)

294,000 - 462,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

Distributed Systems (intermediate)
Web Crawling (intermediate)
Kubernetes (intermediate)
Infrastructure as Code (intermediate)
Data Ingestion (intermediate)
Scalable Systems (intermediate)
Data Processing (intermediate)
Search Algorithms (intermediate)
Key-Value Databases (intermediate)
Backend Services (intermediate)
Python (intermediate)
Go (intermediate)
Data Indexing (intermediate)
System Performance Analysis (intermediate)
Compliance and Privacy (intermediate)
Petabyte-Scale Data (intermediate)
Microservices Architecture (intermediate)
Cloud Infrastructure (intermediate)
DevOps (intermediate)
Experimentation and A/B Testing (intermediate)

Required Qualifications

BS/MS/PhD in Computer Science or related field
6+ years of industry experience in software development
Strong expertise in large stateful distributed systems
Experience with large web crawlers (highly preferred)
Proficiency in Kubernetes and Infrastructure-as-Code
Deep knowledge of data processing pipelines
Experience building scalable backend services
Familiarity with key-value databases and synchronization
Ability to architect data indexing and search algorithms
Strong skills in system performance analysis and experimentation
Excellent communication skills, written and verbal
Proven ability to handle multiple tasks and adapt to priorities
Enthusiasm for new technologies and approaches

Responsibilities

Own and lead engineering projects in data acquisition including web crawling and data ingestion
Collaborate with Data Processing, Architecture, and Scaling teams for seamless data flow
Work with legal team to ensure compliance and data privacy standards
Develop and deploy highly scalable distributed systems handling petabytes of data
Architect and implement advanced algorithms for data indexing and search
Build and maintain robust backend services for data storage and synchronization
Deploy solutions in Kubernetes using Infrastructure-as-Code practices
Perform routine system checks and monitoring in production environments
Conduct experiments on large datasets to analyze system performance
Optimize web crawling infrastructure for efficiency and scale
Design fault-tolerant systems for mission-critical data operations
Contribute to strategic planning for data acquisition roadmap
Mentor junior engineers on distributed systems best practices

Benefits

Comprehensive medical, dental, and vision insurance
401(k) retirement plan with company matching
Unlimited PTO with encouraged recharge periods
Generous parental leave policy (16+ weeks)
Mental health support through dedicated programs
Fitness stipend and wellness reimbursement
Learning and development budget for conferences/courses
Stock options in a high-growth AI company
Commuter benefits and relocation assistance
Catered meals and fully stocked kitchens
Onsite gym and wellness facilities
Volunteer time off and charitable matching
Cutting-edge hardware and equipment
Flexible work arrangements when appropriate

