Machine Learning Engineer, Chakra

at HackerRank Careers

Hybrid in Bangalore, India

HackerRank helps companies like NVIDIA, Amazon, and Microsoft hire and upskill the next generation of developers based on skills, not pedigree. Our platform is trusted by over 2,500 of the world’s most innovative companies to build strong engineering teams ready for what’s next.

Software has entered an era where humans and AI build side by side. As this shift accelerates, the definition of strong technical talent is changing. We give companies better ways to identify and invest in next-generation skills.

People at HackerRank care deeply about the impact of their work and sweat the small details so our customers can be wildly successful with products they genuinely love to use. We move with urgency and believe great outcomes come from high standards.

About the role

The developer's job is shifting from writing code to directing AI agents, and hiring needs to catch up. HackerRank has shaped how 3000+ companies identify engineering talent, with 30M+ developers assessed on our platform. Chakra is our bet on what the next generation of that looks like: an AI interviewer built for a world where the interview itself has to be as intelligent as the candidates it is evaluating.

Open Problem
An interview that thinks, listens and gets it right every time.

Running an interview is easy. Running a good one is hard.

Chakra is an AI interviewer. It holds a conversation with a candidate, asks follow-up questions, evaluates how they think, and produces a report a hiring manager can actually act on. It is built to conduct interviews that are more consistent, more probing, and fairer than most human interviewers manage in practice.

Here is the problem. A great human interviewer can do this. They read the candidate. They push on the right things. They know when an answer is shallow and when it just sounds shallow. Getting a model to do that reliably is genuinely difficult. Not because the technology cannot hold a conversation. It can. The gap is in judgment. Knowing what to probe. Knowing what the answer actually reveals about the candidate. Knowing when to move on.

Now do that 200,000 times. With candidates who speak differently, think differently, and approach problems differently. Without the model drifting. Without it being gamed. Without every report reading like it was written by the same template.

That is where the field currently falls short. Closing that gap is the work.

What you'll do

Architect and develop Chakra end to end: the agent design, conversation management, real-time response evaluation, scoring methodology, and report generation.
Build the systems that ensure interview consistency at scale. Not just model capability, but the infrastructure that makes the 200,000th interview as coherent as the first.
Design evaluation and benchmarking pipelines that measure interview quality, candidate experience consistency, and report defensibility.
Build fine-tuning and RLHF workflows to push model judgment past what off-the-shelf models deliver for this specific task.
Own the quality bar. Define what a good interview looks like, instrument how well the system meets that bar, and close the gap systematically.
Work across the full stack: data pipelines, model serving, latency constraints, and the product experience the candidate actually encounters.

Who you are

You have built and shipped agentic or conversational AI systems in production, not just prototypes.
You have a strong intuition for where LLM behavior breaks down under real-world conditions and how to address it systematically.
You think in systems. The conversation architecture, the evaluation model, the serving infrastructure, and the candidate experience are one problem to you.
You care about the quality bar at the level of a user who depends on the output, not just a researcher measuring aggregate metrics.

Even better if you have

Experience building multi-turn conversational agents or interview-style AI systems.
Experience with RLHF, Constitutional AI, or preference-based fine-tuning methods.
Background in dialogue systems, conversational evaluation, or rubric-based scoring.
Publications or contributions in agentic AI, LLM reliability, or evaluation of generative systems.

You will thrive in this role if

You are energized by the full scope of a hard product problem, from model architecture through the conversation a candidate actually has.
You hold the product bar as high as the technical bar. You want to build something that works extraordinarily well for every single person who uses it.

Want to learn more about HackerRank? Check out HackerRank.com to explore our products, solutions and resources, and to dive into our story and mission.

HackerRank is a proud equal employment opportunity and affirmative action employer. We provide equal opportunity to everyone for employment based on individual performance and qualification. We never discriminate based on race, religion, national origin, gender identity or expression, sexual orientation, age, marital, veteran, or disability status. All your information will be kept confidential according to EEO guidelines.

Notice to prospective HackerRank job applicants:

Our recruiters use @hackerrank.com email addresses.
We never ask for payment or credit check information to apply, interview, or work here.