How do I apply to this Deep Learning Compiler Engineer - CUDA role at NVIDIA?

Click "Apply with AI Applyd" above. We auto-fill the application from your resume and answer screening questions in seconds. No copy and paste, no juggling tabs.

What's the typical salary for Software Engineer in China?

Compensation for Software Engineer roles in China varies widely by seniority, employer size, and remote vs onsite arrangement. Check the salary range on this listing when published, or browse our Software Engineer hub for China medians across recent openings.

How fast does AI Applyd auto-apply?

Most applications complete in under 90 seconds. You can track the status in your dashboard and watch the screenshot proof land the moment the application submits.

What ATS does NVIDIA use?

AI Applyd supports Greenhouse, Lever, Ashby, Workday, iCIMS, SmartRecruiters, LinkedIn Easy Apply, and most other ATS platforms. If we can submit through the platform, we do.

Deep Learning Compiler Engineer - CUDA at NVIDIA in Shanghai, CN

Skills

C++, LLM, ML

About the role

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are now looking for a cuTile Core Compiler Architect to join our group! The NVIDIA Architecture group is looking for world-class architects and engineers to join and lead our various architecture efforts. A key part of NVIDIA's strength is innovation in graphics and parallel computing, delivering the highest performance in the world for parallel processing algorithms. We are constantly looking for ways to improve our GPU architecture and maintain our leadership by developing the new parallel programming models, architectures, and infrastructure required to make this successful.

What you'll be doing:

Design and implement the DSL and core compiler of a tile-aware GPU programming model for emerging GPU architectures

Continuously innovate and iterate on the core architecture of the compiler to consistently optimize performance

Investigate next-generation GPU architectures and provide solutions in the DSL and compiler stack

Analyze performance on emerging AI/LLM workloads and integrate with AI/ML frameworks

What we need to see:

Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)

2+ years of relevant work experience

Excellent C/C++ programming and software engineering skills; an ACM background is a plus

Good fundamental knowledge of computer architecture

Strong ability to abstract problems and a methodical approach to solving them

A strong compiler background, including MLIR, TVM, Triton, or LLVM, is desired

Good knowledge of GPU architecture and fast kernel programming skills are a plus

Knowledge of LLM algorithms or a specific HPC domain is a plus

Knowledge of multi-GPU distributed communication is a plus

Excellent oral communication in English is a plus

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com/

