This position requires office presence of a minimum of 5 days per week and is only located in the location(s) posted. No relocation is offered.
Join AT&T and help shape the future of communications and technology that connect the world. We value innovators who seek to explore the unknown and challenge the status quo. Bring your bold ideas and fearless spirit to redefine connectivity and transform how people share stories and experiences. At AT&T, you won't just imagine the future, you'll build it.
Sr. System Engineer (AI Automation Engineer SRE Focus)
Role Overview - AI-Driven Reliability, Automation & Platform Engineering
We are seeking a Lead AI Automation Engineer with a strong Site Reliability Engineering (SRE) mindset to design, implement, and operate AI-driven automation and intelligent reliability capabilities across mission-critical Front Office (CRM) and Back Office (Supply Chain, Logistics, and ERP) platforms.
This role sits at the intersection of AI automation, AIOps, platform reliability, and enterprise application engineering. You will leverage Generative AI, Large Language Models (LLMs), Agentic AI, and autonomous automation frameworks to dramatically improve system resilience, incident response, observability, and operational efficiency across complex Oracle-based and SaaS ecosystems.
You will be accountable not just for keeping systems running, but for engineering self-healing, predictive, and continuously improving platforms that reduce human toil, prevent incidents before they occur, and scale reliably as the business grows.
What You'll Do
AI-Driven Reliability & Automation Engineering
- Architect and deliver AI-powered automation solutions for production operations, including intelligent incident triage, root cause analysis, remediation, and prevention.
- Design Agentic AI workflows that autonomously monitor systems, analyze anomalies, trigger corrective actions, and orchestrate recovery across ERP, supply chain, and integration layers.
- Apply AIOps techniques to correlate metrics, logs, events, and traces for predictive alerting, noise reduction, and proactive reliability improvements.
- Develop LLM-enabled runbooks and intelligent assistants to guide operational decision-making, accelerate incident response, and upskill operations teams.
Site Reliability Engineering (SRE) & Production Operations
- Own platform stability, uptime, and performance across Oracle EBS/ERP, Oracle Fusion Cloud, and supply chain execution systems.
- Lead incident management, coordinating rapid response, containing impact, and ensuring SLA adherence.
- Conduct blameless postmortems, using AI-assisted RCA to identify systemic issues and drive automation-first corrective actions.
- Partner with development teams to embed reliability, scalability, and observability requirements into system design and delivery.
Enterprise Application & Supply Chain Support
- Provide advanced production support for Oracle EBS/ERP modules including Procurement, Order Management, Inventory, AR, AP, FA, Project Accounting, and Supply Chain Planning.
- Support end-to-end supply chain flows including Procure-to-Pay, Order-to-Cash, inventory transactions, fulfillment, shipping, and reconciliation processes.
- Troubleshoot complex issues across configuration, master data, transactions, batch jobs, interfaces, and integrations, leveraging deep SQL and system-level analysis.
- Monitor and support 3rd-party platforms (O9, Blue Yonder/JDA, RELEX) and integrations with WMS, 3PL, and logistics providers.
Observability, Monitoring & Intelligence
- Build and evolve AI-augmented observability solutions using tools such as Dynatrace, AppDynamics, Splunk, ELK, Grafana, and custom ML models.
- Implement predictive health monitoring, capacity forecasting, and intelligent service-level indicators (SLIs/SLOs).
- Replace static alerts with context-aware, AI-ranked alerts that reduce noise and accelerate resolution.
- Create autonomous dashboards that surface actionable insights rather than raw metrics.
Integration & Automation Excellence
- Diagnose and remediate integration failures across Oracle SOA/OIC, MuleSoft, Kafka/JMS, EDI, and event-driven architectures.
- Automate error handling, replay, deduplication, and reconciliation for high-volume interfaces using AI-assisted logic.
- Collaborate with middleware, cloud, and vendor teams to resolve cross-system defects, data mismatches, latency issues, and sequencing problems.
- Continuously identify and eliminate manual operational toil through intelligent automation and self-service tooling.
Release, Cloud & Platform Engineering
- Support release management, ensuring changes meet reliability, security, and performance standards.
- Apply DevOps and SRE practices including automation-first deployments, rollback strategies, and resilience testing.
- Leverage cloud-native and containerized platforms (Docker, Kubernetes, Azure) to support scalable, resilient workloads.
- Participate in on-call rotations, with a strong emphasis on automation and AI-driven reduction of recurring incidents.
What You'll Bring
Core Experience & Mindset Requirements
- + years of experience across enterprise application engineering, SRE, and production operations, with an automation-first mindset.
- Proven experience driving AI-based automation, AIOps, or intelligent operational tooling in complex enterprise environments.
- Strong ownership mentality for system reliability, performance, and customer impact.
AI, Automation & Engineering Skills
- Hands-on experience with Generative AI, LLMs, or Agentic AI frameworks applied to automation, monitoring, or operations.
- Proficiency in Python, Shell scripting, SQL/PLSQL, and automation frameworks.
- Experience building AI-enhanced runbooks, chatbots, or autonomous operational workflows is highly desirable.
- Ability to translate operational patterns into repeatable, intelligent automation.
Technology Stack
- Deep experience with Oracle EBS and/or Oracle Fusion Cloud (AR, AP, FA, PO, INV, OM, PA, Planning).
- Strong knowledge of observability platforms: Dynatrace, AppDynamics, Splunk, ELK, Grafana.
- Experience with integration technologies: Oracle SOA/OIC, MuleSoft, Kafka/JMS, EDI.
- Familiarity with containers and cloud platforms (Docker, Kubernetes, Azure).
Professional Skills
- Exceptional problem-solving, analytical, and systems-thinking abilities.
- Strong communication skills, capable of explaining complex AI-driven and technical concepts to both technical and non-technical stakeholders.
- Experience leading incidents, facilitating postmortems, and driving cultural adoption of blameless SRE principles.
Education
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field.
Supervisor: No
Our Senior System Engineer earns between $143,800 and $215,800 USD annually, not to mention all the other amazing rewards that working at AT&T offers. Individual starting salary within this range may depend on geography, experience, expertise, and education/training.
Joining our team comes with amazing perks and benefits:
- Medical/Dental/Vision coverage
- 401(k) plan
- Tuition reimbursement program
- Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
- Paid Parental Leave
- Paid Caregiver Leave
- Additional sick leave beyond what state and local law require may be available but is unprotected
- Adoption Reimbursement
- Disability Benefits (short term and long term)
- Life and Accidental Death Insurance
- Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
- Employee Assistance Programs (EAP)
- Extensive employee wellness programs
- Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available), and AT&T phone.
#LI-Onsite - Full-time office role
Ready to join our team? Apply today.
Weekly Hours:
40
Time Type:
Regular
Location:
Alpharetta, Georgia; Plano, Texas
Salary Range:
$128,400.00 - $215,800.00
It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, g