Performance Benchmarking Engineer - Cluster Networking and AI

Oracle

Full-time | Posted: Sep 23, 2025

Job Description

Overview

OCI AI Infrastructure leads in building cutting-edge GPU supercomputers that scale to tens of thousands of GPUs without performance loss. Our team members are experts in RDMA cluster architecture and its impact on AI/ML/HPC performance. We apply this expertise to optimize RDMA network designs, enabling customers to advance the frontiers of AI/ML and HPC.

Responsibilities

  • Design and execute performance benchmarks for RDMA-based GPU clusters at massive scale.
  • Analyze network bottlenecks in AI/ML/HPC workloads using RDMA fabrics.
  • Collaborate with hardware engineers to optimize cluster networking architectures.
  • Develop tools and methodologies for measuring end-to-end cluster performance.
  • Drive performance tuning for multi-tenant GPU supercomputer environments.
  • Research emerging RDMA technologies and their AI/HPC implications.
  • Create performance models predicting scalability for 10k+ GPU clusters.
  • Document findings and present recommendations to stakeholders and customers.

Qualifications

  • BS/MS in Computer Science, Electrical Engineering, or related field.
  • 5+ years of experience in high-performance networking or HPC systems.
  • Deep knowledge of RDMA (RoCE, IB) and cluster interconnects.
  • Strong experience with GPU computing and AI/ML workloads.
  • Proficiency in performance analysis tools (e.g., perf, Nsight, ibv tools).
  • Excellent programming skills in C/C++ and Python for benchmarking.

Benefits

  • Competitive salary and equity package.
  • Comprehensive health, dental, and vision insurance.
  • 401(k) matching and employee stock purchase plan.
  • Flexible PTO and remote/hybrid work options.
  • Professional development budget and conference sponsorships.

Locations

  • Seattle, WA, United States

Salary

Skills Required

  • RDMA cluster architecture (intermediate)
  • AI/ML/HPC performance optimization (intermediate)
  • RDMA network design (intermediate)

Responsibilities

  • Build cutting-edge GPU supercomputers that scale to tens of thousands of GPUs without compromising performance
  • Be the go-to experts on RDMA cluster architecture and its relationship to AI/ML/HPC performance
  • Apply deep understanding of unique workload demands to RDMA network design for customers pushing the cutting edge in AI/ML and HPC

Benefits

  • Health Insurance
  • 401(k)
  • Stock Options
  • Flexible PTO

