Preference Model is building the next generation of training data to power the future of AI.
Today's models are powerful but fall short of their potential across diverse use cases, because many of the tasks we want them to perform lie outside their training data distribution. Preference Model creates reinforcement learning environments that encapsulate real-world use cases, enabling AI systems to practice, adapt, and learn from feedback grounded in reality. We seek to bring the real world into distribution for the models.
As a software engineer, you will:
Architect and build reinforcement learning environments: Design comprehensive simulation platforms including environment context, tasks, and reward functions that enable AI agents to learn and perform complex tasks
Build reinforcement learning training infrastructure: Develop scalable systems for post-training AI models, including orchestration, performance optimization, and monitoring
Create realistic model evaluations: Define assessments of AI agent performance, and build infrastructure and tooling for running evals
Shape technical strategy: Drive architectural decisions, influence product roadmaps, and play a key role in building our engineering culture as an early team member
You may be a good fit if you:
Are good at making language models do your bidding
Spike in interesting ways
Have at least 4 years of experience in software engineering, with demonstrated project ownership
Are skilled in Python, Rust, or TypeScript, with the ability to work effectively across the full stack
Have hands-on experience with modern deployment practices, containerization, and cloud infrastructure (e.g., Kubernetes, AWS or GCP)
Can demonstrate strong problem-solving abilities through algorithmic challenges or complex system design
We prefer candidates with experience in:
Machine learning infrastructure or reinforcement learning systems
Creating simulation environments or language model evaluation
Performance optimization and distributed systems
We value diverse perspectives and experiences. If you're excited about this role but don't check every box, we still encourage you to apply.
We are backed by a Tier 1 VC. We offer a competitive base salary and generous equity (>90th percentile).
Preference Model
https://preference%20model.com