Must-Have Skills: ETL Concepts (Strong), Python (Expert), Data Warehouse Design (General Experience)
Good-to-Have Skills: Data Modeling (Strong)
ROLE PROFILE:
Designs, builds, and optimizes scalable data infrastructure and pipelines to process and manage both structured and unstructured data, powering advanced AI solutions.
KEY RESPONSIBILITIES:
- Design, implement, and maintain robust ETL/ELT processes to ingest, clean, and transform data from multiple sources
- Process and manage complex datasets to support business and analytical needs
- Conduct regular testing and enhancement of data pipelines to improve efficiency
- Implement best practices for data validation, testing, and monitoring, proactively identifying and resolving issues to maintain data integrity
- Collaborate with software engineers and AI/ML engineers to ensure seamless data accessibility
- Implement best practices for data governance, security, and quality assurance
- Document data architectures, lineage, and metadata to ensure transparency and reproducibility
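As a minimal illustration of the ETL/ELT and validation responsibilities above, the extract, transform, and load stages can be sketched in Python. The function names, the `amount` field, and the in-memory sink are illustrative assumptions, not part of this role's actual stack:

```python
import csv
import io

def extract(source):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(source))

def transform(rows):
    """Transform: clean and validate rows, dropping invalid records."""
    cleaned = []
    for row in rows:
        amount = row.get("amount", "").strip()
        if not amount:
            continue  # validation: skip rows missing a required field
        try:
            row["amount"] = float(amount)
        except ValueError:
            continue  # validation: skip rows with unparseable values
        cleaned.append(row)
    return cleaned

def load(rows, sink):
    """Load: append cleaned rows to a sink (stand-in for a warehouse table)."""
    sink.extend(rows)
    return len(rows)

# Example run with two valid and two invalid records.
raw = io.StringIO("id,amount\n1,10.5\n2,\n3,oops\n4,7\n")
sink = []
loaded = load(transform(extract(raw)), sink)  # loads 2 cleaned rows
```

In practice an orchestrator such as Airflow or dbt (named in the qualifications below) would schedule and monitor each stage rather than a single in-process call chain.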
QUALIFICATIONS:
- 5+ years of experience in data engineering, ETL development, or data warehousing
- Demonstrated ability to design and manage ETL/ELT processes using tools like Databricks, Airflow, Luigi, or dbt to automate data workflows and transformations
- Experience with big data architectures and data pipeline optimization
- Deep familiarity with SQL for data manipulation and querying
- Strong coding skills in Python and at least one other language (Scala or Java)
- Hands-on experience with cloud-based data platforms (Azure/AWS/GCP)
- Strong analytical skills for working with large-scale unstructured datasets
- Ability to diagnose data issues, optimize performance, and communicate technical solutions effectively to both technical and non-technical stakeholders
- Educational qualifications: Bachelor's degree in Computer Science, Engineering, or a related field required