Multimodal Large Model Algorithm Intern 106591

Tencent

Software and Technology Jobs

Internship | Posted: Dec 2, 2025

Job Description

📋 Job Overview

The Multimodal Large Model Algorithm Intern role at Tencent's Technology Engineering Group involves researching and developing advanced multimodal large model technologies, focusing on cross-modal alignment and understanding to create industry-leading models. Interns will track cutting-edge algorithms, contribute to model design, training, optimization, and evaluation, and apply these innovations to business scenarios. This position is based in Singapore and offers hands-on experience in AI R&D within a collaborative, innovative environment.

šŸ“ Location: CapitaSky, Singapore

šŸ¢ Business Unit: TEG

šŸ“„ Full Description

Business Unit
Technology Engineering Group (TEG) supports the company and its business groups with technology and operational platforms, and is responsible for building and operating R&D management systems and data centers. TEG also provides users with a full range of customer services. As the operator of the largest networking, device, and data center infrastructure in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms, and supporting business innovation.

What the Role Entails
Conduct research and development of multimodal large model technologies, including cross-modal alignment and multimodal understanding tasks, to build industry-leading multimodal large models.
Continuously track state-of-the-art algorithms in multimodal large models, participate in the design, training, optimization, and evaluation of these models, and promote their application in business scenarios.

Who We Look For
Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields.
Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation), and familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E.
Proficiency in deep learning frameworks like TensorFlow or PyTorch; knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training.
Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python.
Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP) is preferred.
Excellent learning ability, technical curiosity, and strong teamwork and communication skills.

Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Work Location: Singapore-CapitaSky

🎯 Key Responsibilities

  • Conduct research and development of multimodal large model technologies, including cross-modal alignment and multimodal understanding tasks, to build industry-leading multimodal large models
  • Continuously track state-of-the-art algorithms in multimodal large models
  • Participate in the design, training, optimization, and evaluation of these models
  • Promote their application in business scenarios

✅ Required Qualifications

  • Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields
  • Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation), and familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E
  • Proficiency in deep learning frameworks like TensorFlow or PyTorch; knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training
  • Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python

⭐ Preferred Qualifications

  • Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP)

šŸ› ļø Required Skills

  • Solid research background in multimodal understanding
  • Familiarity with mainstream models and algorithms (e.g., CLIP, LLaVA, VALL-E)
  • Proficiency in deep learning frameworks (TensorFlow, PyTorch)
  • Knowledge of distributed training frameworks (DeepSpeed, Megatron-LM)
  • Practical experience in multi-node/multi-GPU distributed training
  • Strong engineering skills
  • Proficiency in at least one programming language (C/C++, Java, Python)
  • Excellent learning ability
  • Technical curiosity
  • Strong teamwork and communication skills

šŸŽ Benefits

  • Equal opportunity employer fostering diverse voices and innovation
  • Supportive environment to achieve individual and common goals
  • Work location in Singapore-CapitaSky

Locations

  • CapitaSky, Singapore

Salary

Estimated Salary Range (medium confidence)

48,000 - 72,000 SGD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Solid research background in multimodal understanding (intermediate)
  • Familiarity with mainstream models and algorithms (e.g., CLIP, LLaVA, VALL-E) (intermediate)
  • Proficiency in deep learning frameworks (TensorFlow, PyTorch) (intermediate)
  • Knowledge of distributed training frameworks (DeepSpeed, Megatron-LM) (intermediate)
  • Practical experience in multi-node/multi-GPU distributed training (intermediate)
  • Strong engineering skills (intermediate)
  • Proficiency in at least one programming language (C/C++, Java, Python) (intermediate)
  • Excellent learning ability (intermediate)
  • Technical curiosity (intermediate)
  • Strong teamwork and communication skills (intermediate)
