Senior Data Infrastructure Engineer
Job Description
- Manage and optimize Databricks infrastructure, including cluster lifecycle management, job scheduling, Unity Catalog administration, user permissions, and cost monitoring.
- Manage Airflow clusters deployed on Kubernetes via Helm charts, including upgrades, configuration management, and integration with ArgoCD and Vault.
- Write and maintain Terraform modules to provision and manage cloud infrastructure in a scalable, reliable, and secure manner.
- Develop and maintain data pipelines and workflows using Python, PySpark, and dbt to support data processing and analytics use cases.
- Use GitHub Copilot with custom skills and agents to improve team productivity, automate repetitive development tasks, and accelerate code reviews and documentation.
- Communicate with stakeholders across the organization to understand their data needs and translate them into infrastructure and pipeline solutions.
- Collaborate with engineering teams to ensure data quality and consistency across the organization.
- Review and approve infrastructure and pipeline changes from team members, ensuring they follow established runbooks and make use of existing automation components.
- Participate in the on-call rotation for data infrastructure and pipeline incidents, taking the actions needed to resolve issues and restore normal service.
- Review and maintain technical documentation and runbooks, keeping them in sync with the current state of the systems.