
ML Engineer - Evaluation Automation, Siri AI Quality Engineering

Apple

Software and Technology Jobs

Full-time · Posted: Aug 19, 2025

Job Description

Apple has an extraordinary reputation for product quality. We are looking for a versatile Machine Learning Engineer with a strong background in Large Language Models (LLMs) to build the next generation of ML evaluation frameworks and tools. In this role, you will use LLMs and other ML techniques to automate large-scale data generation and evaluation job execution, on server or on device; build LLM judges; detect anomalies; and streamline ML evaluation workflows. This is a high-impact role at the intersection of AI/ML, conversational agents, information retrieval, software engineering, and ML evaluation, where you will help push the boundaries of how AI can transform ML evaluation. Specific duties are listed under Responsibilities below.

Locations

  • Cupertino, California, United States 95014

Salary

Estimated Salary Range (medium confidence)

INR 25,000,000 - 60,000,000 per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Machine Learning (intermediate)
  • Large Language Models (LLMs) (intermediate)
  • ML evaluation frameworks (intermediate)
  • ML techniques (intermediate)
  • Data generation (intermediate)
  • Evaluation job execution (intermediate)
  • LLM judges (intermediate)
  • Anomaly detection (intermediate)
  • ML evaluation workflows (intermediate)
  • AI/ML (intermediate)
  • Conversational agents (intermediate)
  • Information retrieval (intermediate)
  • Software engineering (intermediate)
  • Automatic large-scale data generation (intermediate)
  • Automatic UI test evaluation (intermediate)
  • Automatic non-UI test evaluation (intermediate)
  • Running evaluation jobs at scale (intermediate)
  • Building and optimizing LLM judges (intermediate)
  • Intelligent log summarization (intermediate)
  • Fine-tuning foundation models (intermediate)
  • Prompt-engineering foundation models (intermediate)
  • Collaborating with QA teams (intermediate)
  • Integrating models into testing frameworks (intermediate)
  • A/B testing (intermediate)
  • Human feedback loops (intermediate)
  • Retraining (intermediate)
  • Monitoring advances in LLMs (intermediate)
  • NLP (intermediate)

Required Qualifications

  • 3+ years of proven ability in machine learning, including hands-on work with LLMs
  • Strong programming skills in Python and experience with ML/NLP libraries
  • Experience building or fine-tuning LLMs for software engineering tasks
  • Understanding of prompt engineering and retrieval-augmented generation (RAG)
  • Experience developing LLM-based automated evaluation frameworks
  • Excellent knowledge of software testing methodologies and practices

Preferred Qualifications

  • Experience with Swift, XCTest, and XCUITest
  • Ability to thrive in a collaborative working environment within your team and beyond
  • Ability to triage problems, prioritize accordingly, and propose resolutions

Responsibilities

  • Design and develop machine learning and LLM-based solutions for ML model and system evaluation use cases, such as:
      • Automatic large-scale data generation
      • Automatic UI and non-UI test evaluation
      • Running evaluation jobs at scale
      • Building and optimizing LLM judges
      • Intelligent log summarization and anomaly detection
  • Fine-tune or prompt-engineer foundation models (e.g., Apple, GPT, Claude) for evaluation-specific applications
  • Collaborate with QA teams to integrate models into testing frameworks
  • Continuously evaluate and improve model performance through A/B testing, human feedback loops, and retraining
  • Monitor advances in LLMs and NLP and propose innovative applications within the ML evaluation domain
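For candidates unfamiliar with the "LLM judge" pattern named above, here is a minimal sketch in Python of what such a component looks like. Everything in it (the rubric prompt, the `judge` function, the injectable `llm` callable) is illustrative, not Apple's internal tooling; in practice the callable would wrap a real model API.

```python
import json

# Hypothetical rubric prompt; double braces escape the literal JSON braces.
JUDGE_PROMPT = """You are an evaluation judge. Rate the RESPONSE to the QUERY
on a 1-5 scale for correctness, then explain briefly.
Reply as JSON: {{"score": <int>, "reason": "<string>"}}

QUERY: {query}
RESPONSE: {response}
"""

def judge(query, response, llm):
    """Score one (query, response) pair with an LLM judge.

    `llm` is any callable mapping a prompt string to a completion
    string -- injectable here so the flow can be exercised offline.
    """
    raw = llm(JUDGE_PROMPT.format(query=query, response=response))
    verdict = json.loads(raw)          # judge replies in structured JSON
    score = int(verdict["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score, verdict.get("reason", "")

# Stub model standing in for a real API call, so the sketch runs end to end:
def stub_llm(prompt):
    return '{"score": 4, "reason": "mostly correct, minor omission"}'

score, reason = judge("What is 2+2?", "4", stub_llm)
print(score, reason)  # → 4 mostly correct, minor omission
```

The key design point is keeping the model call injectable: the same `judge` can then be A/B-tested against different judge models or rubrics, and validated against human labels, as the responsibilities above describe.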

Tags & Categories

Hardware
