Your Responsibilities:
Own platform operational stability:
- Coordinate operational health across all procurement solutions at platform level;
- Monitor platform KPIs, incidents, problems, changes, and trends to anticipate risks and steer improvements;
Lead major incidents and escalations:
- Lead platform-level P1 incidents and coordinate squads, vendors, and business stakeholders;
- Act as the central point of contact during critical outages to ensure fast recovery and business continuity;
Drive continuous improvement & risk reduction:
- Run post-incident reviews with actionable, cross-squad measures and track systemic fixes to closure;
- Organize and orchestrate problem management within the platform;
- Build and maintain a platform risk register and prioritize remediation across squads;
Strengthen readiness and resilience:
- Lead platform readiness initiatives ahead of business-critical seasons (capacity, resilience, support coverage);
- Define and evolve observability standards, resilience patterns, and recovery readiness (incl. tests and drills);
Shape and execute platform operations strategy:
- Contribute to and execute the platform operations strategy together with the Platform Lead and the Platform Architect;
- Influence platform-level standards, DR cadence, and release governance for critical journeys;
Enable teams and collaborate across platforms:
- Coach squads in an operations assurance mindset and clarify platform vs. squad responsibilities;
- Collaborate with Operations Managers of other platforms and the operations guild to proactively prevent blockers and systemic risks.