Large Model Speech Algorithm Principal Engineer

Tencent

Internship
Posted: Dec 10, 2025

Job Description

Job Overview

Tencent's Technology Engineering Group (TEG) is seeking a Principal Engineer to advance speech and audio large models within their innovative R&D ecosystem. The role focuses on researching, developing, and optimizing models for speech dialogue, audio understanding, and generation to support cutting-edge applications. This position offers the opportunity to contribute to open-sourcing efforts and productization in a collaborative, technology-driven environment in Singapore.

๐Ÿ“ Location: CapitaSky, Singapore

๐Ÿข Business Unit: TEG

๐Ÿ“„ Full Description

Business Unit
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, devices, and data center in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.

What the Role Entails
Research and develop speech/audio large models, including but not limited to models for speech dialogue (speech interaction/audio-video dialogue), audio understanding (ASR/audio captioning), and audio generation (TTS/video dubbing).
Be responsible for data and algorithm work related to the pre-training, post-training, and reinforcement learning (for both text and audio) of speech/audio large models.
Oversee the open-sourcing of speech dialogue/audio understanding/audio generation models and their productization. This includes end-to-end optimization of the full pipeline for speech dialogue products, optimizing audio understanding in scenarios involving noise/accent/far-field/sound effects/music, and enhancing speech synthesis for applications like broadcasting, casual conversation, gaming, and social interaction.

Who We Look For
Prior experience in speech dialogue, speech synthesis, speech recognition, audio-video multimodality, or large language models (pre-training, fine-tuning, reinforcement learning) is preferred.
Strong coding skills and a solid foundation in data structures and algorithms. Proficiency in Python or C/C++ is required, along with familiarity with model training frameworks like PyTorch, Megatron, or DeepSpeed. Prior awards in competitions such as ACM/ICPC, NOI/IOI, Top Coder, or Kaggle are advantageous.
Publications in top-tier conferences or journals such as NeurIPS, ICLR, ICML, ACL, CVPR, ICASSP, or INTERSPEECH are preferred.
Candidates should have a solid background in mathematics and signal processing, good reading ability for English technical literature, strong motivation/curiosity/teamwork spirit, excellent problem-solving skills, and a passion for pursuing technological innovation.

Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Work Location: Singapore-CapitaSky

Key Responsibilities

  • Research and develop speech/audio large models, including models for speech dialogue (speech interaction/audio-video dialogue), audio understanding (ASR/audio captioning), and audio generation (TTS/video dubbing)
  • Be responsible for data and algorithm work related to the pre-training, post-training, and reinforcement learning (for both text and audio) of speech/audio large models
  • Oversee the open-sourcing of speech dialogue/audio understanding/audio generation models and their productization
  • End-to-end optimization of the full pipeline for speech dialogue products
  • Optimizing audio understanding in scenarios involving noise/accent/far-field/sound effects/music
  • Enhancing speech synthesis for applications like broadcasting, casual conversation, gaming, and social interaction

Required Qualifications

  • Strong coding skills and a solid foundation in data structures and algorithms
  • Proficiency in Python or C/C++
  • Familiarity with model training frameworks like PyTorch, Megatron, or DeepSpeed

โญ Preferred Qualifications

  • Prior experience in speech dialogue, speech synthesis, speech recognition, audio-video multimodality, or large language models (pre-training, fine-tuning, reinforcement learning)
  • Prior awards in competitions such as ACM/ICPC, NOI/IOI, Top Coder, or Kaggle
  • Having publications in top-tier conferences or journals such as NeurIPS, ICLR, ICML, ACL, CVPR, ICASSP, or INTERSPEECH

๐Ÿ› ๏ธ Required Skills

  • Solid background in mathematics and signal processing
  • Good reading ability for English technical literature
  • Strong motivation/curiosity/teamwork spirit
  • Excellent problem-solving skills
  • Passion for pursuing technological innovation

๐ŸŽ Benefits

  • Equal opportunity employer fostering diverse voices and innovation
  • Supportive environment to achieve individual and common goals
  • Work location in Singapore-CapitaSky

Locations

  • CapitaSky, Singapore

Salary

Estimated Salary Range (medium confidence)

180,000 - 300,000 SGD per year

Source: AI estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • Solid background in mathematics and signal processing (intermediate)
  • Good reading ability for English technical literature (intermediate)
  • Strong motivation/curiosity/teamwork spirit (intermediate)
  • Excellent problem-solving skills (intermediate)
  • Passion for pursuing technological innovation (intermediate)
