Senior/Staff ML Performance Engineer, Low-Precision Training & Model Quantization
at Nuro
Mountain View, California (HQ), United States
Who We Are
Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automotive-grade hardware. Nuro licenses its core technology, the Nuro Driver™, to support a wide range of applications, from robotaxis and commercial fleets to personally owned vehicles. With technology proven over years of self-driving deployments, Nuro gives automakers and mobility platforms a clear path to AVs at commercial scale—empowering a safer, richer, and more connected future.
About the Role
Nuro is seeking an experienced ML Performance Engineer with deep expertise in quantized training to join our ML Infrastructure team. In this role, you will drive the adoption of state-of-the-art quantization techniques, enabling the training and deployment of highly efficient models that power the Nuro Driver™. You will help shape the technical strategy and partner closely with research and product groups to ensure our ML infrastructure is optimized for both cutting-edge research and real-time deployment on autonomous vehicles.
About the Work
As an ML Performance Engineer on Nuro's ML Training Infrastructure team, you will improve model training efficiency and drive the adoption of state-of-the-art quantization and low-precision training techniques. This will include:
- Staying ahead of emerging research and evaluating new methods.
- Implementing quantization and quantized training methods (e.g., AWQ, GPTQ, AQT) for new and existing self-driving models.
- Leading the design and implementation of efficiency initiatives for model training, including low-bit quantization and pruning, for both research and production workloads.
- Championing and implementing tools and approaches to pinpoint the root causes of model quality and accuracy regressions when training at lower precisions.
- Collaborating cross-functionally with research, infrastructure, and product teams to balance accuracy, latency, and resource constraints.
About You
- 3+ years of professional or research experience in ML infrastructure, distributed training, or ML systems engineering.
- Hands-on experience with quantization methods, including Activation-aware Weight Quantization (AWQ), Accurate Quantized Training (AQT), FP8 training, or related methods.
- Experience building or maintaining quantization libraries (e.g., AQT, bitsandbytes, NVIDIA Transformer Engine, DeepSpeed Compression).
- Understanding of calibration and scaling strategies for quantized models to minimize accuracy loss.
Bonus Points
- Advanced degree (Ph.D. or strong M.Sc. with research experience) in Computer Science, Electrical Engineering, or related fields.
- Knowledge of sparse networks and complementary model compression techniques (e.g., AdaRound, BRECQ, structured pruning).
- Published work or open-source contributions in quantization methods (e.g., AWQ, AQT, GPTQ, SmoothQuant, ZeroQuant).
At Nuro, your base pay is one part of your total compensation package. For this position, the reasonably expected base pay range is between $193,930 and $352,290 for the level at which this job has been scoped. Your base pay will depend on several factors, including your experience, qualifications, education, location, and skills. In the event that you are considered for a different level, a higher or lower pay range would apply. This position is also eligible for an annual performance bonus, equity, and a competitive benefits package.
At Nuro, we celebrate differences and are committed to a diverse workplace that fosters inclusion and psychological safety for all employees. Nuro is proud to be an equal opportunity employer and expressly prohibits any form of workplace discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other legally protected characteristic.
