The Cloud Engineer will build, operate, and improve Kubernetes-based platforms supporting Rhapsody's cloud services. You will own cluster reliability, GitOps-driven deployments (Argo CD or similar), infrastructure-as-code with Terraform modules, and production monitoring using Grafana-style dashboards. You'll collaborate with SRE, Security, and Engineering to deliver resilient, observable, and cost-aware services in a 24×7 environment.
Key Responsibilities
- Operate and harden Kubernetes clusters: upgrade/patch, node pools, CNI, ingress, certificates, autoscaling, quotas, RBAC, and multi-env promotion.
- Implement and maintain GitOps workflows using Argo CD (or Flux): app definitions, health policies, sync strategies, drift detection, rollback.
- Standardize platform add-ons via Helm/Kustomize (ingress, cert manager, secrets, log/metrics/traces agents).
- Build reusable Terraform modules (networking, cluster, storage, identity, observability) and enforce plan/apply and code-review workflows.
- Create Python/Shell automation for cluster operations, validations, drift remediation, image promotion, capacity, and cost hygiene.
- Develop and tune Grafana-style dashboards and alerts; reduce noise, improve MTTR, and document RCAs.
- Apply least-privilege access, secrets hygiene, image provenance, and policy controls; execute maintenance windows, patching, and upgrades.
- Keep runbooks/diagrams/SOPs current; contribute to the knowledge base and mentor junior engineers.
- Collaborate with internal/external stakeholders during deployments, cutovers, and incidents; communicate trade-offs and status clearly.