Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Senior / Lead Data Engineer
We are seeking talented engineers for two roles, Lead Data Engineer and Senior Data Engineer, to join Mastercard Foundry R&D. You will help shape our innovation roadmap by exploring new technologies and building scalable, data-driven prototypes and products. The ideal candidate is hands-on, curious, adaptable, and motivated to experiment and learn.
Lead Data Engineer
What You'll Do
* Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
* Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems.
* Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
* Provide Technical Leadership: Offer hands-on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well-tested code. Introduce improvements in development processes and tooling.
* Cross-Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
What You'll Bring
* Extensive Data Engineering Experience: 8–12+ years in data engineering or backend engineering, including senior/lead roles. Experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production.
* Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
* AI/ML Data Lifecycle Knowledge: Understanding of the data needs of machine learning, including dataset preparation, feature/label management, and support for real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
* Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
* Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
* Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 8–12+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
* Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
* Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
* Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
* Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
* Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
* Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
Preferred Skills
* Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
* Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud-native monitoring.
* DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time-travel) and supporting continuous delivery for ML systems.
* Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
Senior Data Engineer
What You'll Do
* Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
* Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems.
* Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
* Provide Technical Leadership: Offer hands-on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well-tested code. Introduce improvements in development processes and tooling.
* Cross-Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
What You'll Bring
* Data Engineering Experience: Background in data engineering or backend engineering. Experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production would be a plus.
* Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
* AI/ML Data Lifecycle Knowledge: Understanding of the data needs of machine learning, including dataset preparation, feature/label management, and support for real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
* Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
* Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
* Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 5+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
* Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
* Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
* Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
* Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
* Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
* Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
Preferred Skills
* Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).