Data Scientist
Overview
We’re looking for data scientists to help build the next generation of post-training methods for frontier models at Microsoft AI. You’ll join a small, high-impact team working across all stages of post-training, with a focus on evaluation design, high-quality training data, and scalable data pipelines for state-of-the-art foundation models.
In this role, you’ll help turn raw model capability into reliable, aligned, and measurable performance improvements, directly shaping how frontier models behave in real-world deployments.
About The Role
Microsoft AI is building the next generation of frontier models that power Copilot and other large-scale AI experiences. The Post-Training team is responsible for transforming powerful pretrained models into robust, aligned, and high-performing systems used by millions of people worldwide.
Our work focuses on improving general quality, instruction following, coding and math ability, tool use, agentic behaviors, personality, and other critical model capabilities. We operate across the full post-training lifecycle — from data generation and curation, to evaluation and diagnostics, to reward modeling and reinforcement learning.
We are a small, highly autonomous team that works closely with pre-training, product, and engineering partners to rapidly iterate on ideas, run large-scale experiments, and safely advance model capabilities. Each team member owns meaningful parts of the post-training pipeline and has direct access to the compute, data, and decision-making needed to move quickly from insight to production.
Microsoft Superintelligence Team
This role is part of Microsoft AI's Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being.
We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly ambitious, and low-ego individual, you’ll fit right in. Come and join us as we work on our next generation of models!
Responsibilities
Design evaluations of advanced model capabilities and use them to drive rapid, high-signal iteration loops
Work with vendors to produce high-quality evaluation and training data
Build data pipelines to produce high-quality evaluation and training data
Build data flywheels to hill-climb on model weaknesses, using data from various surfaces where our models are deployed
Ensure optimal quality, quantity and coverage of data across our post-training stages
Run post-training experiments and ablations to produce models that climb our evals
Embody our culture and values
We’re Looking For People Who
Have deep experience with LLMs, either training them or applying them in production
Have developed production-scale data pipelines for synthesizing, curating, or processing large quantities of data
Can design, run, and interpret large-scale ML experiments with careful statistical and empirical reasoning
Possess strong generalist engineering and mathematical skills
Communicate clearly in writing and in speech, and collaborate effectively with researchers, engineers, and colleagues in other disciplines
Bonus skills: Demonstrated SOTA results in any area of large-scale training, inference, or evaluation.
Qualifications
Required skills
Hands‑on experience with large language models, including training or applying them in production (not just prompting)
Designing and running post‑training experiments (evals, ablations, preference tuning / RLHF‑style methods)
Building and owning scalable data pipelines for training and evaluation data
Strong Python skills for ML experimentation, data processing, and analysis
Solid statistical, experimental, and general engineering fundamentals
Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800.00 - $234,700.00 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area, where the base pay range for this role is USD $160,200.00 - $261,000.00 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $142,800.00 - $274,800.00 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area, where the base pay range for this role is USD $188,000.00 - $304,200.00 per year.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Compensation
This Data Scientist role pays $120k-$235k per year, which is within the typical range for data scientist roles in the United States.