About the Role
We are looking for an experienced AWS MLOps / DevOps Engineer to design, automate, and optimize machine learning and data workflows on AWS. You will play a central role in building scalable, secure, and reliable ML infrastructure that accelerates model development and deployment across the organization.
This position is ideal for someone who combines strong AWS engineering skills, modern DevOps practices, and hands-on experience with ML workflows.
Key Responsibilities
MLOps & Machine Learning Workflow Automation
- Build and maintain end-to-end ML pipelines using Amazon SageMaker, Step Functions, Lambda, and Glue.
- Implement model lifecycle workflows: data preprocessing, feature engineering, model training, model registration, deployment, and monitoring.
- Automate model deployment to real-time endpoints and batch systems.
- Establish model monitoring that covers data drift, model drift, and performance metrics, with automated retraining triggers.
AWS Cloud Engineering
- Design, automate, and maintain AWS infrastructure using AWS CloudFormation and/or Terraform.
- Build scalable data and ML environments using AWS services such as S3, ECR, ECS/EKS, Lambda, VPC, IAM, and CloudWatch.
- Build Spark-based ETL pipelines using AWS Glue, EMR, or Spark on Kubernetes.
- Ensure compliance with AWS security best practices, including IAM governance and encryption.
DevOps, CI/CD & Automation
- Develop CI/CD pipelines for ML and data workflows using GitHub Actions, GitLab CI, Jenkins, or CodePipeline.
- Implement automated testing for data validation, model quality, pipeline integrity, and infrastructure deployments.
- Maintain logging, monitoring, and observability systems using CloudWatch, Prometheus/Grafana, or ELK.
Collaboration & Technical Leadership
- Work closely with Data Scientists to productionize notebooks, scripts, and models.
- Collaborate with Data Engineering to align data processing workflows with ML requirements.
- Drive best practices in automation, model deployment, and cloud architecture.