We are seeking a detail-oriented and analytical Cluster Duty Engineer to join our organization in Dubai, United Arab Emirates. In this role, you will be responsible for managing and maintaining our server cluster infrastructure, ensuring optimal performance and reliability of our systems. The ideal candidate will demonstrate exceptional organizational skills and the ability to respond efficiently to technical challenges in a fast-paced environment.
- Monitor cluster systems and infrastructure performance using industry-standard tools and dashboards
- Respond promptly to system alerts and incidents, analyzing root causes and implementing effective solutions
- Perform routine maintenance tasks, including system updates, patches, and configuration management
- Troubleshoot hardware and software issues affecting cluster operations and document findings thoroughly
- Maintain detailed technical documentation of cluster configurations, procedures, and incident resolutions
- Collaborate with cross-functional teams to optimize system performance and plan capacity
- Execute backup and disaster recovery procedures to ensure data integrity and business continuity
- Conduct system diagnostics and performance analysis to identify optimization opportunities
- Escalate critical issues to senior engineering staff when necessary and provide comprehensive incident reports
- Adhere to established protocols and standard operating procedures for all cluster management activities