Preference Model is building the next generation of training data to power the future of AI.
Today's models are powerful but fall short of their potential across diverse use cases because many of the tasks we want them to perform lie outside their training data distribution. Preference Model creates reinforcement learning environments that encapsulate real-world use cases, enabling AI systems to practice, adapt, and learn from feedback grounded in reality. Our goal is to bring the real world into distribution for these models.
We're seeking experienced ML engineers to build distributed training infrastructure for our RL training initiatives. Responsibilities include:
Design and implement scalable distributed training infrastructure using PyTorch and Ray
Create automation tools for monitoring, debugging, and recovering from infrastructure failures in distributed training environments
Ensure infrastructure reliability, security, and performance meet the demanding requirements of large-scale ML workloads
We're looking for candidates with the following qualifications and attributes:
Experience building and operating ML infrastructure at scale
Proficiency in PyTorch and distributed training paradigms
Hands-on experience with Ray
Experience with at least one modern RL training framework (verl, NeMo-RL, ART, Atropos, or similar)
Proficiency in Python and systems programming
Experience with container orchestration (Kubernetes) and infrastructure as code (Terraform)
Strong systems thinking with the ability to design for scale
Excellent debugging skills across the entire stack
Collaborative mindset with strong communication skills to work effectively with researchers and engineers
Self-directed problem solver who takes ownership and drives solutions end-to-end
Passion for staying current with the rapidly evolving ML infrastructure landscape
Open-source ML infrastructure contributions
We value diverse perspectives and experiences. If you're excited about this role but don't check every box, we still encourage you to apply.
We are backed by a Tier 1 VC. We offer a competitive base salary as well as generous equity (>90th percentile).
Preference Model
https://preference%20model.com