
ML Engineer - Evaluation Automation, Siri AI Quality Engineering

Apple

Software and Technology Jobs

Full-time · Posted: Aug 19, 2025

Job Description

Apple has an extraordinary reputation for product quality. We are looking for a versatile Machine Learning Engineer with a strong background in Large Language Models (LLMs) to build the next generation of ML evaluation frameworks and tools. In this role, you will use LLMs and other ML techniques to automate large-scale data generation and evaluation job execution, on server or on device; build LLM judges; detect anomalies; and streamline ML evaluation workflows. This is a high-impact role at the intersection of AI/ML, conversational agents, information retrieval, software engineering, and ML evaluation, where you will help push the boundaries of how AI can transform ML evaluation. Specific duties are listed under Responsibilities below.

Locations

  • Cupertino, California, United States 95014

Salary

Estimated Salary Range (medium confidence)

INR 25,000,000 - 60,000,000 per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Machine Learning (intermediate)
  • Large Language Models (LLMs) (intermediate)
  • ML evaluation frameworks (intermediate)
  • ML techniques (intermediate)
  • Data generation (intermediate)
  • Evaluation job execution (intermediate)
  • LLM judges (intermediate)
  • Anomaly detection (intermediate)
  • ML evaluation workflows (intermediate)
  • AI/ML (intermediate)
  • Conversational agents (intermediate)
  • Information retrieval (intermediate)
  • Software engineering (intermediate)
  • Automatic large-scale data generation (intermediate)
  • Automatic UI test evaluation (intermediate)
  • Automatic non-UI test evaluation (intermediate)
  • Running evaluation jobs at scale (intermediate)
  • Building and optimizing LLM judges (intermediate)
  • Intelligent log summarization (intermediate)
  • Fine-tuning foundation models (intermediate)
  • Prompt-engineering foundation models (intermediate)
  • Collaborating with QA teams (intermediate)
  • Integrating models into testing frameworks (intermediate)
  • A/B testing (intermediate)
  • Human feedback loops (intermediate)
  • Retraining (intermediate)
  • Monitoring advances in LLMs (intermediate)
  • NLP (intermediate)

Required Qualifications

  • 3+ years of proven ability in machine learning, including hands-on work with LLMs
  • Strong programming skills in Python and experience with ML/NLP libraries
  • Experience building or fine-tuning LLMs for software engineering tasks
  • Understanding of prompt engineering and retrieval-augmented generation (RAG)
  • Experience developing LLM-based automated evaluation frameworks
  • Excellent knowledge of software testing methodologies and practices

Preferred Qualifications

  • Experience with Swift, XCTest, and XCUITest
  • Ability to thrive in a collaborative working environment within your team and beyond
  • Ability to triage problems, prioritize accordingly, and propose resolutions

Responsibilities

  • Design and develop machine learning and LLM-based solutions for ML model and system evaluation use cases, such as:
      • Automatic large-scale data generation
      • Automatic UI and non-UI test evaluation
      • Running evaluation jobs at scale
      • Building and optimizing LLM judges
      • Intelligent log summarization and anomaly detection
  • Fine-tune or prompt-engineer foundation models (e.g., Apple, GPT, Claude) for evaluation-specific applications
  • Collaborate with QA teams to integrate models into testing frameworks
  • Continuously evaluate and improve model performance through A/B testing, human feedback loops, and retraining
  • Monitor advances in LLMs and NLP and propose innovative applications within the ML evaluation domain
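For candidates unfamiliar with the "LLM judge" pattern named above, here is a minimal sketch in Python of what such a component looks like. Everything in it (the rubric prompt, the `judge` function, the injectable `llm` callable) is illustrative, not Apple's internal tooling; in practice the callable would wrap a real model API.

```python
import json

# Hypothetical rubric prompt; double braces escape the literal JSON braces.
JUDGE_PROMPT = """You are an evaluation judge. Rate the RESPONSE to the QUERY
on a 1-5 scale for correctness, then explain briefly.
Reply as JSON: {{"score": <int>, "reason": "<string>"}}

QUERY: {query}
RESPONSE: {response}
"""

def judge(query, response, llm):
    """Score one (query, response) pair with an LLM judge.

    `llm` is any callable mapping a prompt string to a completion
    string -- injectable here so the flow can be exercised offline.
    """
    raw = llm(JUDGE_PROMPT.format(query=query, response=response))
    verdict = json.loads(raw)          # judge replies in structured JSON
    score = int(verdict["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score, verdict.get("reason", "")

# Stub model standing in for a real API call, so the sketch runs end to end:
def stub_llm(prompt):
    return '{"score": 4, "reason": "mostly correct, minor omission"}'

score, reason = judge("What is 2+2?", "4", stub_llm)
print(score, reason)  # → 4 mostly correct, minor omission
```

The key design point is keeping the model call injectable: the same `judge` can then be A/B-tested against different judge models or rubrics, and validated against human labels, as the responsibilities above describe.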

Tags & Categories

Hardware
