Job Description

Role Overview

We are seeking a Senior Site Reliability Engineer with strong experience in building and maintaining scalable, resilient systems. The ideal candidate will have hands-on expertise in cloud-native technologies, infrastructure as code, observability, and automation, with a focus on Google Cloud Platform (GCP).

Key Responsibilities

Ensure the stability and reliability of cloud-native applications deployed on GCP, containerized with Docker and orchestrated via Kubernetes.

Define, implement, and monitor SLOs, SLAs, and SLIs to measure system performance and user experience.

Automate infrastructure provisioning using Terraform and manage Kubernetes configurations with Kustomize and Helm.

Develop and maintain monitoring and alerting systems using Datadog and GCP-native tools.

Conduct incident analysis and postmortems to drive continuous improvement.

Collaborate with development teams to integrate reliability practices into CI/CD pipelines using GitHub Actions.

Manage and troubleshoot database systems, particularly PostgreSQL and Cassandra.

Apply networking knowledge and Linux system administration skills to troubleshoot and optimize system connectivity and performance.

About METRO/MAKRO

https://careers.smartrecruiters.com/METROMAKRO
