Senior Backend Engineer
LiteLLM is the world's most popular AI Gateway, trusted by top companies like Adobe, Netflix, and NASA. Our platform empowers developers by providing secure, reliable access to LLMs and adjacent services, and we're looking for a Senior Backend Engineer to help us build rock-solid guardrails and observability tooling at scale.
About The Role
You'll own making our guardrails and logging world-class. You will be in charge of the backend code that ensures all guardrail calls are consistently logged, errors are surfaced to users (not silently swallowed), and our observability instrumentation works for real-world, high-volume traffic. Your attention to detail in areas like latency metrics, logging traceability, and backend guardrail registration will directly impact user trust in our security and compliance features.
Responsibilities
Build and scale our product, ensuring performance, reliability, and continuous improvement.
Ensure all guardrail and policy enforcement calls (e.g., applyguardrail) are properly logged and traceable through our SpendLogs and relevant database tables
Design and build CPU-level guardrails to cover common attacks on LLM APIs, MCP servers, and agents
Identify and fix areas where silent failures occur in guardrail creation, registration, and policy application, ensuring robust error handling and transparency to end users
Work with observability integrations, including Datadog, Splunk, Prometheus, and OpenTelemetry, to maintain accurate, configurable, and usable monitoring and logging for backend systems
Enhance observability integrations to work for 1B+ requests per month, with minimal latency overhead and no memory leaks (e.g. due to high-cardinality Prometheus metrics)
Collaborate cross-functionally on backend engineering priorities (performance, reliability, security)
What We're Looking For
Bachelor's or Master's in Computer Science or related field
4+ years of experience with Python and backend frameworks (e.g. FastAPI, Flask)
Understanding of logging best practices, error handling, and secure backend development
Exposure to monitoring, logging, or metrics platforms (Datadog, Splunk, Prometheus, OpenTelemetry)
Familiarity with database integration and troubleshooting (PostgreSQL, Redis, etc.)
Driven to deliver high-quality backend code with strong guardrails, auditing, and debugging capabilities
Eagerness to tackle hard bugs and ensure system transparency for end users
Why Join LiteLLM?
High-impact, mission-critical work on the core of compliance and reliability
Contribute directly to features used by enterprise customers at global scale
Fast-paced growth environment with room for technical ownership
Competitive salary, health, dental, and vision benefits
About LiteLLM
LiteLLM (https://github.com/BerriAI/litellm) is a Python SDK and Proxy Server enabling seamless calls to 100+ LLM APIs in the OpenAI format, trusted by industry leaders worldwide.
Ready to shape the future of secure, observable AI infrastructure? Apply now!