Senior Data Engineer

Amazon

full-time

Posted: September 13, 2025

Number of Vacancies: 1

Job Description

Worldwide Fulfillment by Amazon (WW FBA) empowers millions of sellers to scale globally through Amazon's leading fulfillment network. FBA sellers deliver fast, reliable, Prime-eligible shipping and hassle-free returns to customers worldwide, letting them focus on business growth while Amazon handles operational logistics. The WW FBA Central Analytics team architects and maintains the data infrastructure that delivers critical insights to WW FBA leadership. The team partners strategically with global product, program, and technology teams to unify datasets, implement self-service analytics platforms, and develop AI capabilities that transform raw data into insights.

We're looking for a Senior Data Engineer who thrives on solving hard problems, shaping new capabilities, and delivering high-quality results in a fast-paced environment. You will be at the forefront of integrating LLM-powered solutions with robust backend systems, ensuring they scale securely and reliably to serve global customers.

The key job responsibilities for this role are listed in full under Responsibilities below.

Locations

  • Bengaluru, Karnataka, India

Salary

Salary not disclosed

Estimated Salary Range (high confidence)

120,000 - 220,000 USD / yearly

Source: AI estimate

* This is an estimated range based on market data and may vary based on experience and qualifications.

Skills Required

  • 7+ years of data engineering experience with demonstrated expertise in distributed systems at scale. (intermediate)
  • Deep technical knowledge of AWS data services (Glue, Kinesis, Redshift, S3, Lambda) and infrastructure-as-code. (intermediate)
  • Proven experience designing and implementing enterprise Data Lakehouse architectures and metrics repositories. (intermediate)
  • Experience with vector databases, embedding models, or AI-adjacent data infrastructure. (intermediate)

Required Qualifications

  • 7+ years of data engineering experience with demonstrated expertise in distributed systems at scale.
  • Deep technical knowledge of AWS data services (Glue, Kinesis, Redshift, S3, Lambda) and infrastructure-as-code.
  • Proven experience designing and implementing enterprise Data Lakehouse architectures and metrics repositories.
  • Strong programming skills in Python and/or Scala with expertise in distributed data processing frameworks.
  • Track record of building high-performance, scalable data pipelines supporting mission-critical business operations.
  • Experience with vector databases, embedding models, or AI-adjacent data infrastructure.

Preferred Qualifications

  • Experience architecting data platforms specifically optimized for large-scale AI/ML workloads.
  • Hands-on experience with vector databases (Pinecone, Zilliz, Weaviate) at production scale with billions of vectors.
  • Deep knowledge of LLM integration patterns, prompt engineering, and retrieval optimization techniques.
  • Expertise in semantic layer design and dimensional modeling for analytical and AI applications.
  • Experience with real-time streaming architectures processing millions of events per second.
  • Demonstrated technical leadership driving enterprise-wide architectural initiatives.
  • Strong mentorship track record with proven ability to develop technical talent.

Responsibilities

  • Architect and implement a scalable, cost-optimized S3-based Data Lakehouse that unifies structured and unstructured data from disparate sources.
  • Lead the strategic migration from our Redshift-centric architecture to a flexible lakehouse model.
  • Establish metadata management with automated data classification and lineage tracking.
  • Design and enforce standardized data ingestion patterns with built-in quality controls and validation gates.
  • Architect a centralized metrics repository that becomes the source of truth for all FBA metrics.
  • Implement robust data quality frameworks with staging-first policies and automated validation pipelines.
  • Design extensible metrics schemas that support complex analytical queries while optimizing for AI retrieval patterns.
  • Develop intelligent orchestration for metrics generation workflows with comprehensive audit trails.
  • Lead the design of semantic data models that balance analytical performance with AI retrieval requirements.
  • Implement cross-domain federated query capabilities with sophisticated query optimization techniques.
  • Architect a globally distributed vector database infrastructure capable of managing billions of embeddings with consistent sub-100ms retrieval times.
  • Design and implement hybrid search strategies combining dense vectors with sparse representations for optimal semantic retrieval (see the sketch after this list).
  • Establish automated compliance validation frameworks ensuring data handling meets Amazon's security standards.
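
For context on the hybrid search responsibility above, the following is a minimal, self-contained sketch of the general technique: a dense (embedding-based) similarity score is blended with a sparse, BM25-style keyword score through a weighted sum. Everything here is hypothetical, including the toy corpus, the stand-in embed function, and the alpha weighting; it is an illustration of the concept, not a description of Amazon's or FBA's actual systems.

```python
"""Hypothetical sketch of hybrid retrieval: dense cosine similarity fused
with a sparse, BM25-style keyword score. Toy data only; a production
system would use a vector database and a real embedding model."""

import math
from collections import Counter

# Toy corpus standing in for indexed documents (hypothetical).
DOCS = {
    "d1": "fba seller inventory forecast accuracy metric",
    "d2": "prime shipping speed and delivery promise metrics",
    "d3": "seller returns rate and refund processing metric",
}

def embed(text: str) -> list[float]:
    """Stand-in 'embedding': character-trigram hash counts, L2-normalized.
    A real system would call an embedding model instead."""
    dims = 64
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def dense_score(query: str, doc: str) -> float:
    """Cosine similarity between query and document embeddings."""
    q, d = embed(query), embed(doc)
    return sum(a * b for a, b in zip(q, d))

def sparse_score(query: str, doc: str) -> float:
    """Rough BM25-flavored keyword overlap score (k1=1.2, b=0.75)."""
    doc_terms = Counter(doc.split())
    doc_len = sum(doc_terms.values())
    score = 0.0
    for term in set(query.split()):
        tf = doc_terms[term]
        if tf:
            score += (tf * 2.2) / (tf + 1.2 * (0.25 + 0.75 * doc_len / 6.0))
    return score

def hybrid_search(query: str, alpha: float = 0.6) -> list[tuple[str, float]]:
    """Blend normalized dense and sparse scores; alpha weights the dense side."""
    dense = {k: dense_score(query, v) for k, v in DOCS.items()}
    sparse = {k: sparse_score(query, v) for k, v in DOCS.items()}

    def normalize(scores: dict[str, float]) -> dict[str, float]:
        hi = max(scores.values()) or 1.0
        return {k: v / hi for k, v in scores.items()}

    dense, sparse = normalize(dense), normalize(sparse)
    fused = {k: alpha * dense[k] + (1 - alpha) * sparse[k] for k in DOCS}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for doc_id, score in hybrid_search("seller returns metric"):
        print(f"{doc_id}: {score:.3f}")
```

In production, the dense scores would come from a vector database and the sparse scores from a text index rather than in-memory dictionaries; weighted-sum fusion is one common way to blend the two signals, with reciprocal rank fusion being another frequently used alternative.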

Tags & Categories

Data Science
