Applied Science Internship - Multimodal Foundation Models & Robotics

Microsoft

internship

Posted: September 19, 2025

Number of Vacancies: 1

Job Description

Location: Zürich, Switzerland
Contract Type: Internship
Duration: 12 weeks (40 hrs/week)

The Spatial AI Lab is part of the Applied Sciences Group, a Microsoft research and development organization dedicated to creating next-generation human-computer interaction technologies leveraging the most recent AI developments and exploring new hardware capabilities and device form-factors. Our team of scientists and engineers has strong expertise in computer vision and multi-modal AI, with a particular focus on spatial and embodied AI.

As part of our growing team, you will work alongside our researchers at the intersection of large-scale generative modeling and embodied AI, with a focus on robotics. You will be an integral part of our team’s mission of building the core intelligence for a new generation of agents, training the multimodal foundation models that empower them to perceive complex environments, reason about tasks, and act seamlessly across both the physical and digital worlds.

This opportunity will allow you to gain invaluable hands-on experience in training embodied foundation models, receive mentorship from leading experts, and contribute to our pioneering research through both advancement of internal capabilities and publications.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Locations

Zürich, Zürich, Switzerland

Salary

Salary not disclosed

Required Qualifications

Currently enrolled in a Master's or PhD program in Computer Science, Robotics, Electrical Engineering, or a related technical field.
Hands-on experience with modern deep learning frameworks (e.g., PyTorch, TensorFlow, JAX).
Fluent in English.
Foundation Models: hands-on training experience in at least one of the following topics: LLMs; large vision-language models (VLMs); video generative models and diffusion algorithms; or action-based transformers and vision-language-action models (VLAs).
Large-Scale ML Systems: experience with large-scale machine learning compute systems.
Robotics: hands-on training experience in robot learning techniques, such as reinforcement learning and imitation learning, as well as classical control methods; solid understanding of robot kinematics, dynamics, and sensors; familiarity with control algorithms such as PID, model predictive control (MPC), and whole-body control.
Self-motivated team player, problem solver, and keen to learn.
Bachelor's degree

Preferred Qualifications

Foundation Models: hands-on training experience in at least one of the following topics: LLMs; large vision-language models (VLMs); video generative models and diffusion algorithms; or action-based transformers and vision-language-action models (VLAs).
Large-Scale ML Systems: experience with large-scale machine learning compute systems.
Robotics: hands-on training experience in robot learning techniques, such as reinforcement learning and imitation learning, as well as classical control methods; solid understanding of robot kinematics, dynamics, and sensors; familiarity with control algorithms such as PID, model predictive control (MPC), and whole-body control.
Self-motivated team player, problem solver, and keen to learn.

Responsibilities

Contribute to the design and implementation of novel AI algorithms and models for general-purpose embodied agents.
Gain hands-on experience optimizing and deploying AI models on robot hardware.
Contribute to developing high-performance machine-learning pipelines and optimizing data and learning stacks for scalability, efficiency, and performance.
Collaborate across Microsoft research and engineering teams to transition cutting-edge research into real-world impact.
Contribute to research that leads to publications at leading AI and robotics conferences (e.g., CoRL, RSS, NeurIPS, ICML).

Travel Requirements

3 days per week in office

