Product Manager, Agent Harness
At a glance
Highlights
- Flat organization
- Small talent-dense team
- Research-adjacent product focus
Why this role might suit you
A product leader with strong technical depth and experience with AI agents or LLM-powered developer tools will thrive shaping the Agent Harness and turning research breakthroughs into measurable product improvements for developers.
About the role
Our mission is to automate coding. The first step in our journey is to build the best tool for professional programmers, using a combination of inventive research, design, and engineering. Our organization is very flat, and our team is small and talent-dense. We particularly like people who are truth-seeking, passionate, and creative. We enjoy spirited debate, crazy ideas, and shipping code.
The Agent Harness is what makes Cursor's agents actually work. It determines how agents decompose tasks into subtasks, how they interact with the file system and terminal, how they handle failures and retries, and how developers observe and steer what's happening. When an agent gets stuck, loops, or hallucinates, the harness is why—and the harness is how you fix it.
As a Product Manager for the Agent Harness, you will own this framework. Agent quality is improving rapidly—we shipped Composer 2, our own frontier coding model, and are training agents through real-time RL on user data. Your job is to turn those research advances into product that developers can feel.
This is not a role where you write specs and hand them off. You'll be reading agent traces, analyzing failure modes, designing evaluation frameworks, and making judgment calls about what an agent should and shouldn't attempt. You'll work at the boundary between research and product, where the roadmap is shaped by empirical results as much as customer feedback.
EXAMPLE PROJECTS INCLUDE...
- Owning the agent planning and execution framework: how agents decompose tasks, decide what tools to use, and recover when a step fails. Balancing autonomy with predictability.
- Designing how developers observe and steer agents: real-time progress, guardrails, the ability to redirect mid-task. The experience should build trust without requiring micromanagement.
- Building evaluation and benchmarking systems: defining what "good" means for agent quality—task completion rate, error recovery, hallucination frequency—and building the harnesses to measure it. These measurements drive engineering and research priorities.
- Analyzing agent traces at scale: identifying where agents get stuck, loop, hallucinate, or take unproductive paths, and turning those patterns into concrete improvements.
- Defining the primitives for agent extensibility: how agents use tools, access codebase context, and call external services via MCP (Model Context Protocol) servers and plugins on the Cursor Marketplace, and how developers customize agent behavior through rules and constraints.
- Improving the default Cursor agent experience (the “Auto” model setting): making smart model choices based on user needs, model capabilities, and cost appetite.
- Shaping multi-agent coordination: how subagents share context and avoid conflicts when executing in parallel across files and systems. This matters more as developers spin up fleets of agents simultaneously.
YOU MAY BE A FIT IF
- You have built or evaluated AI agents, LLM applications, or ML-powered developer tools.
- You're deeply technical. You're comfortable reading code, analyzing traces, and reasoning about system behavior at a low level.
- You have strong intuition for evaluation and measurement. You know how to define metrics that capture quality, not just activity.
- You can move between the big picture and the details—from "what should agents be capable of in six months?" to "why did this agent fail on this specific task?"
- You're comfortable in a research-adjacent environment where the roadmap is shaped by empirical results, not just customer requests.
- You have experience with reinforcement learning, agent frameworks, or AI evaluation—either as a practitioner or working closely with researchers.
- You thrive in ambiguous, fast-moving environments and enjoy making hard tradeoffs with incomplete information.