We are seeking a Senior Machine Learning Engineer / Platform Engineer to design and build a production-grade agentic workflow platform. This role sits at the intersection of LLM systems engineering, distributed platforms, and applied ML, with a strong emphasis on orchestration, reliability, and extensibility. You will be responsible for architecting and implementing agent-based workflows that integrate large language models, retrieval systems, structured knowledge, and external APIs, designed for robustness, observability, and real-world business use.
- Design and implement multi-agent and single-agent workflows using orchestration patterns and tools, context engineering, memory management, and guardrail strategies.
- Design RAG pipelines incorporating vector search, hybrid retrieval, and citation tracking.
- Implement knowledge-graph-backed reasoning, including ontologies, entity resolution, and graph-based context construction.
- Design evaluation frameworks for agent task-completion correctness, quality, cost, and latency.
- Develop and deploy machine learning models, focusing on production readiness, scalability, and performance.
- Collaborate with data scientists to transition experimental models into robust, production-grade applications.
- Integrate with collaboration platforms (e.g., Teams, alerting systems) for intelligent distribution of insights.
- Implement and manage CI/CD pipelines to automate deployment, testing, and monitoring of models.
- Architect and deploy systems on AWS, leveraging compute, storage, and security services.