Senior Research Data Engineer: MSR AI for Science

Microsoft

Full-time | Posted: Oct 1, 2025

Job Description

At Microsoft Research AI for Science, we believe machine learning and artificial intelligence have the potential to transform scientific modelling and discovery crucial for solving the most pressing problems facing society, including sustainable materials and the discovery of new drugs.

We seek a highly motivated Senior RSDE to join our Biomolecular Emulator (BioEmu) team. The BioEmu project aims to model the dynamics and function of proteins: how they change shape, bind to each other, and bind small molecules. This approach will help us understand biological function and dysfunction on a structural level and lead to more effective and targeted drug discovery. Our BioEmu-1 model was published in Science (see our blog post for links to our open-source software and other resources and this explainer video).

Locations

  • Multiple Locations, Germany
  • Cambridge, Cambridgeshire, United Kingdom

Salary

Estimated Salary Range (high confidence)

180,000 - 260,000 USD per year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.

Required Qualifications

  • PhD or equivalent experience in Computer Science, Machine Learning, Applied Mathematics, Computational Biology, or related field.
  • Strong software engineering in Python (packaging, testing, CI), with systems thinking for data‑intensive ML.
  • Deep learning experience (PyTorch/JAX/TensorFlow) and solid foundations in linear algebra, probability, and statistics.
  • Proven experience designing robust data pipelines for large‑scale ML (HPC or cloud).
  • Ability to reason about learning signal and to assess information content of real‑world scientific datasets.
  • Excellent collaboration and communication in interdisciplinary teams.
  • Hands‑on cryo‑EM experience (e.g., map reconstruction, refinement, or pipeline tooling).
  • CUDA or C++ for performance‑critical components; experience with mixed precision and memory‑efficient training.
  • Experience integrating experimental data into ML models (e.g., constraints/priors from cryo‑EM, binding assays, spectroscopy).
  • Familiarity with MD data, structure prediction systems, or protein design workflows.
  • Experience with cost‑optimization for data collection and cloud utilization; clear track record of building reliable, maintainable research software at scale.
  • Experience with structural biology or molecular biology data/techniques (e.g., cryo‑EM, binding assays, spectroscopy, expression, sequencing).

Responsibilities

  • Data integration for structure & dynamics: Build ingestion/curation pipelines for structural/biophysical data (mmCIF/PDB, EM maps/particles, binding/biophysics, spectroscopy); implement map/volume preprocessing (e.g., resolution filtering, normalization) and alignment to model inputs/outputs.
  • Cryo‑EM expertise: Operationalize end‑to‑end flows from raw image stacks/particles to 3D maps and model‑ready tensors; interoperate with community formats (e.g., EMDB/EMPIAR, mmCIF) and link to sequences/annotations.
  • Signal & information content: Design dataset diagnostics (e.g., mutual‑information‑like measures, effective sample size, SNR proxies) to quantify what data teach the model; build active‑learning loops that maximize learning per euro of data collection time.
  • Model‑aware data services: Implement scalable, versioned data services and feature stores that feed training/evaluation; design loaders/augmentations optimized for throughput and correctness (GPU‑aware).
  • Training‑at‑scale engineering: Own distributed data pipelines and orchestration for large runs on Azure; profile and tune I/O, storage tiers, data locality, and caching; monitor cost, utilization, and failure modes.
  • Quality, governance, and reproducibility: Codify schemas/ontologies, metadata contracts, unit/integration tests, and lineage; automate validation and data drift detection; maintain documentation and examples.
  • Partner across disciplines: Work closely with ML researchers, structural biologists, and drug designers; translate experimental constraints into robust computational workflows; communicate clearly and proactively.

Travel Requirements

3 days / week in-office

