- Design, develop, and operate scalable and maintainable data pipelines in the Azure Databricks environment
- Develop all technical artifacts as code, implemented in professional IDEs, with full version control and CI/CD automation
- Enable data-driven decision-making in Human Resources (HR), Purchasing (PUR) and Finance (FIN) by ensuring high data availability, quality, and reliability
- Implement data products and analytical assets using software engineering principles in close alignment with business domains and functional IT
- Apply rigorous software engineering practices such as modular design, test-driven development, and artifact reuse in all implementations

- Global delivery footprint; cross-functional data engineering support across HR, PUR & FIN domains
- Collaboration with business stakeholders, functional IT partners, product owners, architects, ML/AI engineers, and Power BI developers
- Agile, product-team structure embedded in an enterprise-scale Azure environment

Main Tasks:

• Design scalable batch and streaming pipelines in Azure Databricks using PySpark and/or Scala
• Implement ingestion from structured and semi-structured sources (e.g., SAP, APIs, flat files)
• Build bronze/silver/gold data layers following the defined lakehouse layering architecture & governance

• Implement use-case-driven dimensional models (star/snowflake schema) tailored to HR, PUR & FIN needs
• Ensure compatibility with reporting tools (e.g., Power BI) via curated data marts and semantic models

• Implement enterprise-level data warehouse models (domain-driven 3NF models) for HR, PUR & FIN data, in close alignment with the data engineers of other business domains
• Develop and apply master data management strategies (e.g., Slowly Changing Dimensions)

• Develop automated data validation tests using established testing frameworks
• Monitor pipeline health, identify anomalies, and implement quality thresholds
• Establish data quality transparency by defining meaningful data quality rules together with source-system and business stakeholders, implementing those rules, and building the related reports

• Develop and structure pipelines using modular, reusable code in a professional IDE
• Apply test-driven development (TDD) principles with automated unit, integration, and validation tests
• Integrate tests into CI/CD pipelines to enable fail-fast deployment strategies
• Commit all artifacts to version control with peer review and CI/CD integration
• Work closely with Product Owners to refine user stories and define acceptance criteria
• Translate business requirements into data contracts and technical specifications
• Participate in agile events such as sprint planning, reviews, and retrospectives

• Document pipeline logic, data contracts, and technical decisions in Markdown or in documentation auto-generated from code
• Align designs with governance and metadata standards (e.g., Unity Catalog)
• Track lineage and audit trails through integrated tooling

• Profile and tune data transformation performance
• Reduce job execution times and optimize cluster resource usage
• Refactor legacy pipelines or inefficient transformations to improve scalability

IT engineer data lakehouse
