Research Engineer - Evaluations

Canva

Job Type: Internship

Posted: December 16, 2025

Number of Vacancies: 1

Job Description

Research Engineer - Evaluations

Location: London, United Kingdom

Team: Engineering

About the Role

At Canva, our mission is to empower the world to design through innovative generative AI. We're seeking a Research Engineer - Evaluations to build our next-generation evaluation system, which leverages automatic evaluations driven by Multimodal Large Language Models (MLLMs). In this high-impact Engineering role based in London, UK, you'll engineer sophisticated AI agents that autonomously assess the quality, relevance, and human alignment of our generative design models, creating rapid feedback loops that shape the future of design generation for millions of users worldwide.

You'll focus on agentic evaluation systems; inference-time alignment techniques such as prompt engineering, retrieval-augmented generation (RAG), and in-context learning (ICL); and rigorous model benchmarking frameworks. Primary responsibilities include designing scalable 'MLLM-as-a-Judge' infrastructure, analyzing failure modes for actionable insights, and collaborating with research scientists to integrate evaluations into Canva's ML lifecycle. Working in our collaborative, design-focused culture, you'll turn cutting-edge research into practical systems that make our AI genuinely helpful and aligned with creative needs.

Canva's innovative environment thrives on creativity and teamwork, and you'll join a world-class Engineering team pushing the boundaries of generative design. With hybrid flexibility in London, you'll enjoy equity packages, inclusive benefits, and a vibe that balances hard work with moments of magic. If you excel in PyTorch, distributed training, and cloud environments, and are passionate about AI that empowers design, join us to redefine how the world creates.
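
For candidates who want a concrete picture of the core system, below is a minimal, hypothetical sketch of an 'MLLM-as-a-Judge' call that combines a scoring rubric with a few-shot in-context example. The rubric, the call_mllm stub, and every name here are illustrative assumptions for this posting, not Canva's actual implementation or API.

    import json
    from dataclasses import dataclass

    def call_mllm(messages: list[dict], image: bytes) -> str:
        """Stand-in for a real multimodal LLM client; hypothetical, defined only so
        the sketch is self-contained. A real system would call an MLLM endpoint."""
        raise NotImplementedError("plug in an actual MLLM client here")

    # Illustrative scoring rubric; the criteria and the 1-5 scale are assumptions.
    RUBRIC = (
        "You are a design-quality judge. Score the candidate design from 1 (poor) "
        "to 5 (excellent) on relevance, visual hierarchy, legibility, and brand fit, "
        'then reply as JSON: {"relevance": int, "hierarchy": int, "legibility": int, '
        '"brand_fit": int, "overall": int, "rationale": str}.'
    )

    # One in-context example (ICL) steers the judge's scoring at inference time,
    # without any fine-tuning of the underlying model.
    FEW_SHOT = [
        ("Minimalist poster for a jazz night",
         '{"relevance": 5, "hierarchy": 4, "legibility": 5, "brand_fit": 4, '
         '"overall": 5, "rationale": "Clean layout, clear focal point."}'),
    ]

    @dataclass
    class Judgement:
        overall: int
        rationale: str

    def judge_design(user_prompt: str, design_image: bytes) -> Judgement:
        """Grade one generated design with the MLLM judge (sketch only)."""
        messages = [{"role": "system", "content": RUBRIC}]
        for example_prompt, example_verdict in FEW_SHOT:
            messages.append({"role": "user", "content": example_prompt})
            messages.append({"role": "assistant", "content": example_verdict})
        messages.append({"role": "user", "content": user_prompt})

        raw = call_mllm(messages, image=design_image)  # hypothetical client call
        parsed = json.loads(raw)
        return Judgement(overall=parsed["overall"], rationale=parsed["rationale"])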

Key Responsibilities

  • Design, build, and optimize infrastructure for an 'MLLM-as-a-Judge' evaluation system providing scalable, automated feedback
  • Implement and experiment with inference-time alignment techniques including Prompt Engineering, RAG, and ICL
  • Establish and manage comprehensive benchmarking processes for foundation models on design-centric tasks (see the sketch after this list)
  • Analyze evaluation data to identify model failure modes and deliver actionable recommendations
  • Collaborate with research scientists and ML engineers to integrate agentic evaluation into the model development lifecycle
  • Translate latest research in LLM evaluation and agentic AI into production-ready engineering solutions
  • Engineer autonomous AI agents using Multimodal Large Language Models to assess generated design quality and human alignment
  • Build rigorous frameworks for systematic model benchmarking and analysis
  • Provide rapid feedback loops to guide the evolution of Canva's generative design models
  • Optimize evaluation systems for speed and scalability across distributed environments
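
As referenced in the benchmarking responsibility above, here is a minimal, hypothetical sketch of how automated judge scores might be aggregated across candidate models. The function names and the per-model mean are illustrative assumptions, not Canva's actual benchmarking framework.

    import statistics
    from collections import defaultdict
    from typing import Callable

    def benchmark(models: dict[str, Callable[[str], bytes]],
                  prompts: list[str],
                  judge: Callable[[str, bytes], float]) -> dict[str, float]:
        """Run every candidate model over a fixed prompt set, score each output with
        an automated judge, and report the mean score per model (sketch only).

        `models` maps a model name to a hypothetical generate(prompt) -> image_bytes
        callable; `judge` is any (prompt, image) -> score callable, such as a wrapper
        around an MLLM-as-a-Judge. A production benchmark would also track
        per-criterion breakdowns, confidence intervals, and failure-mode tags."""
        scores: dict[str, list[float]] = defaultdict(list)
        for name, generate in models.items():
            for prompt in prompts:
                image = generate(prompt)  # candidate model under evaluation
                scores[name].append(judge(prompt, image))
        return {name: statistics.mean(vals) for name, vals in scores.items()}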

Required Qualifications

  • Strong understanding of generative AI models (e.g., Diffusion Models, GANs, Transformers) and their architectures
  • Practical experience creating data-driven evaluation methodologies for AI models
  • Experience managing or optimizing large-scale distributed model training across hundreds of GPUs
  • Solid understanding of machine learning with hands-on experience using PyTorch and code optimization for speed
  • Disciplined coding practices including experience with code reviews and pull requests
  • Experience working in cloud environments, ideally AWS
  • Proven ability to analyze complex data and provide actionable insights

Preferred Qualifications

  • Familiarity with evaluation libraries and frameworks
  • Experience building or working with agentic AI systems or multi-agent coordination
  • Knowledge of data visualization tools for effective communication of findings
  • Background or interest in human-computer interaction, design principles, or AI ethics

Required Skills

  • Generative AI model architectures (Diffusion, GANs, Transformers)
  • PyTorch proficiency and performance optimization
  • Distributed training across GPU clusters
  • Cloud infrastructure (AWS preferred)
  • MLLM and agentic AI systems
  • Inference-time alignment techniques (Prompt Engineering, RAG, ICL)
  • Data-driven evaluation methodologies
  • Model benchmarking and analysis
  • Code review and disciplined engineering practices
  • Multimodal evaluation systems
  • Collaborative problem-solving in research teams
  • Translating research into production systems
  • Design quality assessment frameworks
  • Human alignment evaluation strategies
  • Scalable automation engineering

Benefits

  • Equity packages to share in Canva's success
  • Inclusive parental leave policy supporting all parents and carers
  • Annual Vibe & Thrive allowance for wellbeing, social connection, and home office setup
  • Flexible leave options empowering personal recharge and growth
  • Hybrid work model balancing collaboration and flexibility
  • Regular moments of magic, connectivity, and fun woven into Canva life
  • Access to flagship campuses in Sydney and London, and to our European operations
  • Opportunities to work on world-class generative AI empowering millions of designers

Canva is an equal opportunity employer.

Locations

  • Engineering, Global

Salary

Estimated Salary Range (high confidence)

95,000 - 165,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.



Tags & Categories

  • Canva
  • Design
  • London / United Kingdom
  • Engineering
  • Global

