RESUME AND JOB

Large Model Inference Backend Development Engineer (Shenzhen/Beijing)

Tencent

Full-time
Posted: Dec 11, 2025

Job Description

📋 Job Overview

We are seeking a Large Model Inference Backend Development Engineer to design and evolve a leading online inference platform for large models, supporting hundreds of millions of daily calls with high performance, availability, and scalability for Tencent's AI business. The role involves optimizing inference service architecture by leveraging engine and hardware features to enhance scheduling, resource management, and cost efficiency. Responsibilities include developing standardized frameworks, tools, and high-availability systems to streamline the full pipeline from model development to deployment while ensuring reliability through observability and fault tolerance.

📍 Location: Shenzhen, China

🏢 Business Unit: TEG

📄 Full Description

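1. Design and evolve an industry-leading online inference platform for large models, building a high-performance, highly available, and highly scalable service system that supports hundreds of millions of daily calls and provides a solid inference foundation for the company's AI business.
2. Design a high-performance inference service architecture that draws on the core features of inference engines and the underlying hardware to optimize key backend strategies such as dynamic scheduling and resource management, achieving the best balance of service performance and cost efficiency.
3. Develop a standardized inference service framework and supporting toolchain that connects the full pipeline from model development and performance optimization to online deployment, improving the engineering efficiency of bringing inference services into production.
4. Build the platform's high-availability architecture and observability system, implementing core capabilities such as disaster recovery, rate limiting, and circuit breaking, and providing data and technical support for capacity planning and emergency response to ensure service reliability.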

🎯 Key Responsibilities

Design and evolve a leading large model online inference platform, building a high-performance, high-availability, and highly scalable service system to support hundreds of millions of daily calls and provide a solid inference capability foundation for the company's AI business.
Design high-performance inference service architecture, combining core features of inference engines and underlying hardware to optimize key backend strategies such as dynamic scheduling and resource management, achieving optimal service performance and cost efficiency.
Develop standardized inference service frameworks and supporting toolchains, connecting the full pipeline from model development and performance optimization to online deployment, improving the engineering efficiency of inference service implementation.
Build the platform's high-availability architecture and observability system, implementing core capabilities like fault tolerance, rate limiting, and circuit breaking, providing data and technical support for capacity planning and emergency response to ensure service reliability.

Locations

Shenzhen, China

Salary

Estimated Salary Range (medium confidence)

300,000 - 800,000 CNY / year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Tags & Categories

Tencent, Shenzhen, China, TEG
