Get to Know the Team
The Fulfilment Tech Family builds the systems that power Grab's marketplaces across Southeast Asia. We design real-time, distributed systems and Machine Learning (ML) solutions that process hundreds of millions of requests each day. Our work drives supply allocation, pricing, and order matching for millions of users and driver-partners.
Our mission is three-fold:
- Deliver products that work for our driver-partners
- Meet consumer demand, regardless of conditions
- Build marketplaces that balance experience and cost for everyone involved
We are looking for a Senior Principal Machine Learning Engineer to lead our shift toward automated marketplace optimization. You'll advance how we use data and ML to automate pricing, dispatch, and supply management decisions.
Get to Know the Role
This is a Senior Principal individual contributor role where you'll build the foundation for autonomous, learning-driven marketplace systems. You'll work at the intersection of reinforcement learning, large language models, and production systems that operate at scale.
Your work centres on two areas:
- Reinforcement Learning (RL) Systems: You'll develop systems that jointly optimize pricing, dispatching, and supply repositioning. You'll build decision agents that handle multiple objectives and adapt when real-world conditions change.
- LLM-Based Behavioural Intelligence: You'll architect systems using fine-tuned language models to predict, explain, and simulate user decision-making at scale. These models will power the next generation of marketplace automation.
You'll serve as the technical lead for a small team, guiding both research direction and production implementation. You'll report to the Head of Data Science and work from Grab's One-North Singapore office.
The Critical Tasks You Will Perform
You'll:
- Design and implement end-to-end RL systems that combine model-based RL, offline RL, simulation, and online learning into a unified training pipeline. This includes creating state representations and reward structures that balance short-term results with long-term outcomes.
- Build latent world models and marketplace state representations that capture supply-demand interactions, location-based patterns, and behavioural signals from users and drivers.
- Develop systems that optimize across multiple marketplace levers simultaneously (pricing, dispatching, and supply repositioning) to expand the set of achievable outcomes for the business.
- Create policy evaluation frameworks and establish monitoring systems that allow safe deployment of new decision-making policies in production.
- Fine-tune open-source large language models on domain-specific data to build capabilities for prediction, reasoning, and simulation within marketplace applications.
- Design and implement training strategies for language models, including supervised fine-tuning, preference-based alignment, and iterative improvement methods.
- Work with data engineers and backend engineers to integrate RL and LLM systems into real-time production environments serving millions of users.