The Cloud Engineer - Azure focuses on the deployment, operation, and troubleshooting of Azure cloud services with a strong emphasis on Unix/Linux systems and networking. You will build and support secure, resilient connectivity and platform services covering VNets, routing, private access, load balancing, and hybrid connectivity (VPN/ExpressRoute) while partnering with SRE, Security, Engineering, Product Support, and customer teams across US/UK/APAC time zones. Success requires hands-on Azure depth, Linux administration, IaC automation, and crisp incident handling in a 24×7 environment.
Key Responsibilities
- Provision, configure, and operate Azure resources: VMs/VMSS (Linux), VNets/Subnets, NSGs/ASGs, UDRs/route tables, Private Link/Private Endpoints, Application Gateway, Azure Load Balancer, Azure Firewall, Bastion, Azure DNS, Storage.
- Implement hybrid connectivity patterns: site-to-site VPN (IPsec/IKEv2), ExpressRoute, vWAN, and hub-and-spoke designs.
- Apply RBAC, Managed Identities, and Key Vault for secrets and certificate lifecycle.
- Build and maintain infrastructure using Terraform (azurerm) and/or Bicep, with Azure CLI and Git-based workflows.
- Write or extend Bash/Python scripts to automate builds, validations, patching, and operational checks; contribute to reusable modules/patterns in CI/CD.
- Monitor health with Azure Monitor, Log Analytics/KQL, Application Insights, and Network Watcher (Connection Monitor, NSG flow logs, packet capture).
- Perform deep troubleshooting across Linux OS, networking (routing/NAT/DNS/TLS), private connectivity, load balancing, and platform services; create clear diagnostics and timelines.
- Coordinate maintenance windows, patching, and compliance activities; maintain auditable SOPs/runbooks/diagrams and follow change/incident/problem processes.
- Work directly with customer IT/network teams to plan connectivity (VPN/ExpressRoute), execute cutovers, and resolve issues; communicate trade-offs clearly.
- Collaborate with SRE/Engineering to improve observability, resiliency, and cost efficiency; assist Support with Azure/network-centric cases.
- Participate in the global on-call rotation for P1/P2 incidents; ensure accurate ticket hygiene and clean shift handoffs.
- Contribute to post-incident reviews, knowledge base updates, and continuous improvement initiatives.