Purpose
The Senior SRE for AWS is responsible for designing, implementing, and operating highly reliable, scalable, and secure cloud infrastructure on AWS. The role ensures the stability and performance of production systems by applying Site Reliability Engineering principles, automation, and Infrastructure as Code practices.
WHAT WILL YOU DO?
- Collaborate with architects and application engineers to ensure applications are maintainable, scalable, and follow appropriate disaster recovery and high availability strategies
- Contribute to handbooks, runbooks, and architecture design documents to ensure consistent working methods and transparency between teams
- Develop and automate standard operating procedures around common failure scenarios and manual operating tasks
- Work in scrum teams alongside architects and product managers to support and deploy new infrastructure and operational requirements
- Design, manage and maintain tools to automate manual operational processes
- Build and maintain production systems on AWS using ALB, ELB, WAFs, S3, Serverless, APIs, Route 53, etc., with the use of IaC (Terraform and CloudFormation)
- Troubleshoot problems, involving the appropriate resources and driving resolution of issues with a focus on minimizing impact to our customers
- Leverage deep expertise to plan and lead the deployment of cloud solutions into production environments with the use of CI/CD pipelines
- Create practical demonstrations of proposed solutions and present them to other members of the team
- Contribute to the development of best practices for Infrastructure as Code, software build tools, and Continuous Integration
- Work and collaborate with multi-national teams in an international environment