The CoreAI Speech Group brings together talent in signal processing, speech modeling, statistical modeling, and deep learning to develop and deliver robust, natural, and scalable speech technologies across a rich set of scenarios and languages. We welcome Applied Scientists to join our cutting-edge voice Artificial Intelligence (AI) team. You'll work at the intersection of deep learning, signal processing, and speech/audio modeling to push the boundaries of natural, expressive, and multilingual speech generation. Your innovations will power next-generation products in conversational AI and accessibility, impacting and enriching the lives of millions of users worldwide.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Locations
Redmond, Washington, United States
Salary
USD 119,800 - 234,700 per year
Required Qualifications
Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field, plus 4 years of related experience
Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field, plus 3 years of related experience
Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field, plus 1 year of related experience
2 years of experience presenting at conferences such as Automatic Speech Recognition and Understanding (ASRU), International Conference on Acoustics, Speech and Signal Processing (ICASSP), Interspeech, or other events
Preferred Qualifications
Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field, plus 6 years of related experience
Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field, plus 3 years of related experience
3 years of experience conducting research as part of a research program (in academic or industry settings)
1 year of experience developing and deploying products or systems at multiple points in the product cycle, from ideation to shipping
1 year of experience developing and deploying live production systems as part of a product team
Experience presenting at conferences or other events in the outside research/industry community as an invited speaker
Responsibilities
Design and develop novel speech algorithms to advance state-of-the-art technologies, with a focus on real-world applications in speech generation
Tackle scalability challenges by aligning solutions with evolving stakeholder requirements
Leverage large-scale computing frameworks, data analysis platforms, and modeling environments to enhance model performance and efficiency
Deploy models into production environments and iterate based on empirical results and user impact
Conduct rigorous experimentation by evaluating multiple models in live scenarios to assess comparative performance
Continuously monitor deployed algorithms to ensure they meet expected behavior, accuracy thresholds, and performance guardrails
Embody Microsoft's culture and values