Position Type: Full-Time, Remote
Working Hours: U.S. Client Business Hours (with flexibility for deployments, incident response, and on-call rotations)
Our client is seeking a highly skilled DevOps Engineer to build, maintain, and optimize cloud infrastructure and deployment pipelines, and to keep production systems reliable.
This role requires deep expertise in cloud platforms, automation, CI/CD, container orchestration, monitoring, and infrastructure security. The DevOps Engineer will play a critical role in ensuring systems remain scalable, secure, resilient, and highly available while enabling development teams to ship code efficiently and safely.
The ideal candidate is proactive, automation-driven, calm under pressure, and passionate about improving infrastructure reliability and operational efficiency.
• Provision, configure, and maintain infrastructure on AWS, GCP, or Azure
• Implement Infrastructure-as-Code (IaC) using Terraform, Pulumi, or CloudFormation
• Configure networking, compute, storage, IAM, and cloud security policies
• Optimize infrastructure performance, scalability, and cost efficiency
• Build and maintain CI/CD pipelines using GitHub Actions, Jenkins, GitLab CI, or CircleCI
• Automate build, testing, deployment, and rollback workflows across environments
• Ensure zero-downtime deployments and reliable release processes
• Improve deployment speed, consistency, and operational reliability
• Manage Docker containers and Kubernetes clusters for microservices deployment
• Monitor cluster health, resource allocation, and workload performance
• Optimize orchestration strategies for scalability and reliability
• Troubleshoot container and deployment-related issues across environments
• Implement observability and monitoring solutions using Prometheus, Grafana, Datadog, or New Relic
• Configure centralized logging and alerting pipelines using ELK Stack, Splunk, or similar tools
• Participate in incident response and on-call rotations
• Perform root cause analysis (RCA) and implement preventive solutions post-incident
• Apply infrastructure security best practices including IAM, encryption, secrets management, and least-privilege access
• Support compliance requirements such as SOC 2, HIPAA, PCI-DSS, or GDPR where applicable
• Conduct vulnerability scans, patch management, and security hardening activities
• Ensure infrastructure remains secure, compliant, and audit-ready
• Partner closely with development teams to improve deployment workflows and automation
• Support developers with infrastructure troubleshooting and environment management
• Identify opportunities to improve reliability, scalability, and operational efficiency
• Maintain clear documentation for infrastructure, deployment pipelines, and operational procedures
• Strong problem solver who thrives at the intersection of development and operations
• Calm, analytical, and methodical during incidents and high-pressure situations
• Passionate about automation, infrastructure scalability, and reliability engineering
• Strong communicator who collaborates effectively across technical teams
• Proactive mindset focused on preventing issues before they impact production
• 3+ years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering
• Proficiency with at least one major cloud provider (AWS, GCP, or Azure)
• Strong experience building and managing CI/CD pipelines
• Hands-on experience with Docker and Kubernetes
• Infrastructure-as-Code expertise with Terraform, Pulumi, or CloudFormation
• Experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, or New Relic
• Scripting experience with Python, Bash, or similar languages
• Strong understanding of cloud security best practices and infrastructure reliability
• Experience with microservices and distributed systems
• Familiarity with serverless technologies (e.g., AWS Lambda or Google Cloud Functions)
• Experience managing production-grade Kubernetes environments
• Cloud certifications such as AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or equivalent
• Background supporting SaaS, fintech, healthcare, or enterprise applications
A DevOps Engineer’s day revolves around keeping systems secure, automated, scalable, and reliable. You will:
• Review monitoring dashboards and respond to infrastructure alerts or incidents
• Improve CI/CD pipelines to streamline testing and deployment workflows
• Provision or optimize infrastructure using Terraform or cloud-native tools
• Troubleshoot deployment issues and collaborate with developers on production releases
• Monitor Kubernetes clusters and containerized services for performance and reliability
• Document workflows, update runbooks, and improve operational processes
• Analyze logs, metrics, and incidents to proactively prevent future issues
In essence: you are responsible for ensuring infrastructure remains secure, scalable, automated, and capable of supporting fast, reliable product delivery.
• System uptime ≥ 99.9%
• Higher deployment frequency with more reliable releases
• Reduced MTTR (Mean Time to Recovery) during incidents
• Infrastructure cost optimization and efficiency improvements
• Higher deployment success rate and fewer rollbacks
• Positive developer feedback on infrastructure reliability and deployment speed
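To make the uptime and MTTR targets above concrete, here is a minimal Python sketch of the arithmetic behind them. The function names `downtime_budget` and `mean_time_to_recovery` are hypothetical, the 30-day measurement window is an assumption, and MTTR is taken here as a simple average of per-incident recovery times; the client may measure either metric differently.

```python
from datetime import timedelta


def downtime_budget(slo: float, period: timedelta) -> timedelta:
    """Maximum downtime allowed in `period` while still meeting `slo` (e.g. 0.999)."""
    if not 0.0 <= slo <= 1.0:
        raise ValueError("slo must be between 0 and 1")
    return timedelta(seconds=period.total_seconds() * (1.0 - slo))


def mean_time_to_recovery(recoveries: list[timedelta]) -> timedelta:
    """Arithmetic mean of per-incident recovery durations."""
    if not recoveries:
        raise ValueError("need at least one incident")
    return sum(recoveries, timedelta()) / len(recoveries)


if __name__ == "__main__":
    # Assumed 30-day window; 99.9% is the uptime target from the list above.
    print("Downtime budget:", downtime_budget(0.999, timedelta(days=30)))

    # Illustrative incident durations, not real data.
    incidents = [timedelta(minutes=20), timedelta(minutes=40), timedelta(minutes=30)]
    print("MTTR:", mean_time_to_recovery(incidents))
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of total downtime per 30-day window, which is why fast incident recovery matters as much as preventing incidents.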
• Full-time remote position aligned with U.S. business hours
• Flexible schedule for deployments and incident response
• Opportunity to work with modern cloud infrastructure and cutting-edge DevOps tooling
• Exposure to complex infrastructure and scalability challenges
• Professional growth in DevOps, SRE, and cloud engineering
• Competitive compensation package
• Initial Phone Screen
• Video Interview with Pavago Recruiter
• Technical Assessment (e.g., design a CI/CD pipeline or provision infrastructure with Terraform)
• Client Interview with Engineering/DevOps Leadership
• Offer & Background Verification