We are looking for a Principal Data Engineer to own and build the production-grade data layer that powers a Claims AI / Intelligent Suite running on Azure Databricks. This is a hands-on role embedded in the delivery team, responsible for ingestion, transformation, storage, quality, and serving of claims-related data used by AI models and agent workflows.
You will work closely with AI Engineers, the Lead Databricks Architect, and the client’s Cloud and Platform teams to ensure data pipelines and foundations are reliable, scalable, and ready for AI-driven workloads. This is not a BI or reporting role: the primary consumers of your work are AI systems, agents, and vector search pipelines.
• Design, build, and maintain production-grade data pipelines in Azure Databricks using Delta Live Tables and Structured Streaming.
• Implement and operate a medallion architecture (bronze, silver, and gold layers) with clear data contracts, quality controls, and freshness SLAs.
• Build and maintain scalable data models and feature tables for claims, policies, litigation, and adjuster data.
• Engineer data preparation pipelines for AI workloads, including structured data serving and unstructured document processing for vector search and RAG use cases.
• Ensure data quality, observability, and reliability through automated checks, data lineage, schema enforcement, and freshness monitoring.
• Own pipeline orchestration, CI/CD, monitoring, and failure recovery for production data systems.
• Collaborate closely with AI Engineers and the Lead Databricks Architect to align the data architecture with agentic AI requirements and platform decisions.
• Work with client data owners and platform teams to manage data access, upstream changes, and source system dependencies.