Key Responsibilities:
Leadership and Team Management:
- Lead, mentor, and develop a team of infrastructure operations engineers, fostering a high-performing culture.
- Set performance expectations, provide feedback, and manage team schedules and staffing levels.
Infrastructure Operations and Maintenance:
- Oversee day-to-day operations, maintenance, and monitoring of IT infrastructure components.
- Ensure high availability, performance, and stability of critical systems and services.
- Implement ITIL best practices for IT service management.
- Manage backup, recovery, and disaster recovery processes.
Cloud Infrastructure Management:
- Manage and optimize cloud infrastructure (e.g., AWS, Azure, GCP) for performance, cost-efficiency, and security.
Security and Compliance:
- Collaborate with the security team to implement and maintain security policies and procedures.
- Ensure compliance with relevant industry regulations and standards.
Problem Solving and Continuous Improvement:
- Lead resolution of complex infrastructure issues and conduct root cause analysis.
- Identify opportunities for process improvement and automation within infrastructure operations.
Collaboration and Communication:
- Work effectively with other IT teams and business stakeholders.
- Communicate infrastructure status, incidents, and projects to technical and non-technical audiences.
Technical Expertise:
- Demonstrate strong technical expertise across infrastructure technologies, including server operating systems, networking, storage, virtualization, cloud platforms, monitoring tools, and automation.