Multimodal Large Model Algorithm Intern 106591

Tencent

Internship · Posted: Dec 2, 2025

Job Description

📋 Job Overview

The Multimodal Large Model Algorithm Intern role at Tencent's Technology Engineering Group involves researching and developing advanced multimodal large model technologies, focusing on cross-modal alignment and understanding to create industry-leading models. Interns will track cutting-edge algorithms, contribute to model design, training, optimization, and evaluation, and apply these innovations to business scenarios. This position is based in Singapore and offers hands-on experience in AI R&D within a collaborative, innovative environment.

📍 Location: CapitaSky, Singapore

🏢 Business Unit: TEG

📄 Full Description

Business Unit
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, device, and data center facilities in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.

What the Role Entails
Conduct research and development of multimodal large model technologies, including cross-modal alignment and multimodal understanding tasks, to build industry-leading multimodal large models.
Continuously track state-of-the-art algorithms in multimodal large models, participate in the design, training, optimization, and evaluation of these models, and promote their application in business scenarios.

Who We Look For
Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields.
Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation), with familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E.
Proficiency in deep learning frameworks like TensorFlow or PyTorch; knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training.
Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python.
Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP) is preferred.
Excellent learning ability, technical curiosity, and strong teamwork and communication skills.

Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Work Location: Singapore-CapitaSky

🎯 Key Responsibilities

Conduct research and development of multimodal large model technologies, including cross-modal alignment and multimodal understanding tasks, to build industry-leading multimodal large models
Continuously track state-of-the-art algorithms in multimodal large models
Participate in the design, training, optimization, and evaluation of these models
Promote their application in business scenarios

✅ Required Qualifications

Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields
Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation), with familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E
Proficiency in deep learning frameworks like TensorFlow or PyTorch; knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training
Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python

⭐ Preferred Qualifications

Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP) is preferred

🛠️ Required Skills

Solid research background in multimodal understanding
Familiarity with mainstream models and algorithms (e.g., CLIP, LLaVA, VALL-E)
Proficiency in deep learning frameworks (TensorFlow, PyTorch)
Knowledge of distributed training frameworks (DeepSpeed, Megatron-LM)
Practical experience in multi-node/multi-GPU distributed training
Strong engineering skills
Proficiency in at least one programming language (C/C++, Java, Python)
Excellent learning ability
Technical curiosity
Strong teamwork and communication skills

🎁 Benefits

Equal opportunity employer fostering diverse voices and innovation
Supportive environment to achieve individual and common goals
Work location in Singapore-CapitaSky

Locations

CapitaSky, Singapore

Salary

Estimated Salary Range (medium confidence)

48,000 - 72,000 SGD / year

Source: AI estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

Solid research background in multimodal understanding (intermediate)
Familiarity with mainstream models and algorithms (e.g., CLIP, LLaVA, VALL-E) (intermediate)
Proficiency in deep learning frameworks (TensorFlow, PyTorch) (intermediate)
Knowledge of distributed training frameworks (DeepSpeed, Megatron-LM) (intermediate)
Practical experience in multi-node/multi-GPU distributed training (intermediate)
Strong engineering skills (intermediate)
Proficiency in at least one programming language (C/C++, Java, Python) (intermediate)
Excellent learning ability (intermediate)
Technical curiosity (intermediate)
Strong teamwork and communication skills (intermediate)
