Own and build the enterprise AI Operations practice, ensuring all production AI, agentic, and automation solutions are reliable, observable, well-governed, and continuously improving. This is a hands-on leadership role responsible for building the AI Operations function from the ground up, including implementing frameworks, observability, and operational playbooks. This leader defines and operationalizes AI Ops standards, frameworks, and capabilities across the organization, establishing clear accountability, visibility, and control over AI systems at scale.
Serve as the functional leader for AI Operations, partnering closely with Product, Engineering, AI Transformation, and Delivery teams to ensure AI solutions are production-ready, resilient, and aligned to enterprise standards. Drive the implementation of governance, observability, and service management practices required to safely scale AI across the business.
This role establishes the operational foundation required to scale AI across the enterprise with confidence. By ensuring reliability, visibility, and control of AI and agentic systems in production, this leader enables widespread adoption while minimizing operational risk, controlling costs, and driving continuous improvement toward a more autonomous and efficient organization.
This position offers a hybrid work schedule that combines in-office presence with remote work, and is based in either Houston or Dallas, TX.
Key Responsibilities
- Own the Enterprise AI Operations Practice End-to-End
Hold full accountability for the AI Operations strategy, operating model, standards, processes, tools, and governance frameworks. Define, implement, and evolve them to ensure consistency, reliability, and scalability across all production AI, agentic, and automation solutions.
- Drive Production Reliability, Support, and Governance
Establish and operationalize ITIL-aligned practices for incident, problem, and change management tailored to AI and agentic systems. Define enterprise SLAs, escalation paths, and support models (internal, supplier, hybrid), ensuring strong governance of agent behavior, access, and production risk across large-scale AI deployments.
- Lead Observability, Optimization, and Continuous Improvement
Define and implement enterprise observability and monitoring standards for AI behavior, performance, cost, and risk. Establish proactive detection, alerting, and operational telemetry across AI systems. Identify cross-product trends and systemic issues, driving improvements in reliability, performance, and cost efficiency.
- Establish Enterprise Reporting and Operational Reviews
Define standardized AI Operations metrics, scorecards, and dashboards. Publish consistent reporting on AI solution health, risks, and performance. Lead operational review forums, supplier accountability discussions, and action tracking to ensure continuous improvement.
- Partner with Product, Delivery, and Technical Teams
Serve as the primary operations counterpart to AI Product Management, Engineering, AI Platforms, Architecture, and DevOps, ensuring alignment on production readiness, observability integration, risk management, and operational excellence.
- Mature AI Operations as Capabilities Evolve
Continuously evolve the AI Operations practice in step with advancements in AI, agentic systems, orchestration, and automation technologies. Embed Responsible AI principles, governance, and cost optimization practices into operations.
- Build AI Operations Capability Across the Organization
Develop and implement playbooks, standards, training, and communities of practice to elevate AI operations maturity. Mentor teams and review production readiness and operational strategies to ensure alignment and scalability.