Role Overview
We are looking for a Big Data Engineer to join our data team and lead the development of scalable data pipelines and analytics workflows. The role focuses on processing large-scale test and experimental data using PySpark, building reliable data models, and enabling downstream analytics and visualization in Tableau.
You will work closely with sensor experts, test engineers, and infrastructure engineers to transform raw testing data into high-quality datasets that support operational monitoring, performance analysis, and decision-making.
Key Responsibilities
Data Pipeline Development
- Design, build, and maintain scalable data pipelines using PySpark and Spark SQL
- Process and transform large volumes of structured and semi-structured test data
- Implement efficient ETL/ELT workflows for data ingestion, cleaning, transformation, and aggregation
- Optimize Spark jobs for performance, reliability, and cost efficiency
Data Modeling & Data Quality
- Design data models and curated datasets optimized for analytics and reporting
- Implement data validation, monitoring, and quality checks
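The validation work above can be sketched with a small, library-agnostic row check, shown here in plain Python; the required fields and valid range are assumptions for illustration, not a real spec.

```python
# Row-level data quality check: flag missing fields or out-of-range readings
# before rows reach curated datasets. Schema and thresholds are hypothetical.
VALID_RANGE = (0.0, 100.0)
REQUIRED_FIELDS = ("sensor_id", "reading")

def validate_row(row: dict) -> list[str]:
    """Return a list of human-readable issues; an empty list means the row passes."""
    issues = []
    for field in REQUIRED_FIELDS:
        if row.get(field) is None:
            issues.append(f"missing {field}")
    reading = row.get("reading")
    if reading is not None and not (VALID_RANGE[0] <= reading <= VALID_RANGE[1]):
        issues.append(f"reading {reading} outside {VALID_RANGE}")
    return issues

rows = [
    {"sensor_id": "a", "reading": 12.5},
    {"sensor_id": None, "reading": 250.0},
]
reports = [validate_row(r) for r in rows]
```

Checks like this are typically wired into the pipeline itself so bad rows are quarantined and counted, and the counts feed monitoring dashboards.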
Analytics Enablement
- Prepare analysis-ready datasets for downstream analytics
- Collaborate with analysts to support Tableau dashboards and reports
- Optimize data structures to improve query performance for visualization tools
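One common way to improve query performance for visualization tools, as the last point describes, is pre-aggregating data to the grain a dashboard actually displays (e.g. one row per sensor per day) so the tool scans far fewer rows. A plain-Python sketch of the idea, with hypothetical data:

```python
# Pre-aggregate raw readings to daily-per-sensor grain: the dashboard then
# queries a small summary table instead of the full raw dataset.
from collections import defaultdict

raw_readings = [
    ("sensor_a", "2024-01-01", 12.0),
    ("sensor_a", "2024-01-01", 14.0),
    ("sensor_b", "2024-01-01", 7.5),
]

acc = defaultdict(lambda: [0.0, 0])  # (sensor, day) -> [running sum, count]
for sensor, day, value in raw_readings:
    acc[(sensor, day)][0] += value
    acc[(sensor, day)][1] += 1

# One row per (sensor, day): the shape a dashboard-facing table would have.
daily_avg = {key: total / n for key, (total, n) in acc.items()}
```

In a real pipeline this aggregation would be a Spark job writing a curated table, but the design choice is the same: match the stored grain to the dashboard's grain.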
Collaboration
- Work closely with data engineers, test engineers, and product stakeholders in a global team
- Translate analytical requirements into scalable data solutions
- Review code and contribute to best practices in data engineering