Yuanbao - Multimodal Large Model Algorithm Researcher

Tencent

Software and Technology Jobs

Full-time | Posted: Nov 30, 2025

Job Description

📋 Job Overview

The role involves participating in the full-cycle R&D of audio large models, focusing on cross-modal alignment, multimodal understanding, and generation across text and speech. Responsibilities include data cleaning and preparation, model algorithm selection and optimization, and advancements in pre-training, supervised fine-tuning, and reinforcement learning. The position also entails end-to-end optimization of speech dialogue models for challenging scenarios and exploring new paradigms in multimodal model understanding and generation.

📍 Location: Beijing, China

🏢 Business Unit: CSIG

📄 Full Description

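1.  Participate in the full-cycle R&D of audio large models, including cross-modal alignment and multimodal understanding and generation. The work covers cleaning and producing training data such as text and speech, selecting and optimizing foundation model algorithms, and driving technical iteration in key stages such as pre-training, supervised fine-tuning, and reinforcement learning;
2.  Own end-to-end quality optimization of speech dialogue large models: improve understanding and generation in far-field, low signal-to-noise-ratio, multi-speaker, and music scenarios; improve the model's understanding of dialects and paralinguistic information; and strengthen emotional dialogue capability;
3.  Explore understanding and generation paradigms for multimodal models, follow new multimodal large model architectures in the industry, and carry out frontier research and deployment work from multiple angles, including improving model quality and reducing full-pipeline processing latency.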

🎯 Key Responsibilities

  • Participate in the full-cycle R&D of audio large models, including cross-modal alignment and multimodal understanding and generation. This covers cleaning and producing text and speech training data, selecting and optimizing foundation model algorithms, and iterating on pre-training, supervised fine-tuning, and reinforcement learning.
  • Optimize the end-to-end quality of speech dialogue large models: improve understanding and generation in far-field, low-SNR, multi-speaker, and music scenarios, improve the model's handling of dialects and paralinguistic information, and strengthen emotional dialogue ability.
  • Explore understanding and generation paradigms for multimodal models, follow new multimodal large model architectures in the industry, and conduct research and deployment work on improving model quality and reducing full-pipeline processing latency.

Locations

  • Beijing, China

Salary

Estimated Salary Range (medium confidence)

400,000 - 800,000 CNY per year

Source: AI estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Tags & Categories

Tencent, Beijing, China, CSIG
