Lead System Engineer (AI Automation Engineer SRE Focus)

AT&T

Job Location:

Plano, TX - USA

Annual Salary: $158,200 - $237,400
Posted on: 4 days ago
Vacancies: 1

Job Summary

Join AT&T and help shape the future of communications and technology that connect the world. We value innovators who seek to explore the unknown and challenge the status quo. Bring your bold ideas and fearless spirit to redefine connectivity and transform how people share stories and experiences. At AT&T, you won't just imagine the future; you'll build it.


Lead System Engineer (AI Automation Engineer SRE Focus)

Role Overview - AI-Driven Reliability Automation & Platform Engineering

We are seeking a Lead AI Automation Engineer with a strong Site Reliability Engineering (SRE) mindset to design, implement, and operate AI-driven automation and intelligent reliability capabilities across mission-critical Front Office (CRM) and Back Office (Supply Chain, Logistics, and ERP) platforms.

This role sits at the intersection of AI automation, AIOps, platform reliability, and enterprise application engineering. You will leverage Generative AI, Large Language Models (LLMs), Agentic AI, and autonomous automation frameworks to dramatically improve system resilience, incident response, observability, and operational efficiency across complex Oracle-based and SaaS ecosystems.

You will be accountable not just for keeping systems running but for engineering self-healing, predictive, and continuously improving platforms that reduce human toil, prevent incidents before they occur, and scale reliably as the business grows.

What You'll Do

AI-Driven Reliability & Automation Engineering

  • Architect and deliver AI-powered automation solutions for production operations, including intelligent incident triage, root cause analysis, remediation, and prevention.
  • Design Agentic AI workflows that autonomously monitor systems, analyze anomalies, trigger corrective actions, and orchestrate recovery across ERP, supply chain, and integration layers.
  • Apply AIOps techniques to correlate metrics, logs, events, and traces for predictive alerting, noise reduction, and proactive reliability improvements.
  • Develop LLM-enabled runbooks and intelligent assistants to guide operational decision-making, accelerate incident response, and upskill operations teams.

Site Reliability Engineering (SRE) & Production Operations

  • Own platform stability, uptime, and performance across Oracle EBS/ERP, Oracle Fusion Cloud, and supply chain execution systems.
  • Lead incident management, coordinating rapid response, containing impact, and ensuring SLA adherence.
  • Conduct blameless postmortems using AI-assisted RCA to identify systemic issues and drive automation-first corrective actions.
  • Partner with development teams to embed reliability, scalability, and observability requirements into system design and delivery.

Enterprise Application & Supply Chain Support

  • Provide advanced production support for Oracle EBS/ERP modules, including Procurement, Order Management, Inventory, AR, AP, FA, Project Accounting, and Supply Chain Planning.
  • Support end-to-end supply chain flows, including Procure-to-Pay, Order-to-Cash, inventory transactions, fulfillment, shipping, and reconciliation processes.
  • Troubleshoot complex issues across configuration, master data, transactions, batch jobs, interfaces, and integrations, leveraging deep SQL and system-level analysis.
  • Monitor and support 3rd-party platforms (O9, Blue Yonder/JDA, RELEX) and integrations with WMS, 3PL, and logistics providers.

Observability, Monitoring & Intelligence

  • Build and evolve AI-augmented observability solutions using tools such as Dynatrace, AppDynamics, Splunk, ELK, Grafana, and custom ML models.
  • Implement predictive health monitoring, capacity forecasting, and intelligent service-level indicators and objectives (SLIs/SLOs).
  • Replace static alerts with context-aware, AI-ranked alerts that reduce noise and accelerate resolution.
  • Create autonomous dashboards that surface actionable insights rather than raw metrics.

Integration & Automation Excellence

  • Diagnose and remediate integration failures across Oracle SOA/OIC, MuleSoft, Kafka/JMS, EDI, and event-driven architectures.
  • Automate error handling, replay, deduplication, and reconciliation for high-volume interfaces using AI-assisted logic.
  • Collaborate with middleware, cloud, and vendor teams to resolve cross-system defects, data mismatches, latency issues, and sequencing problems.
  • Continuously identify and eliminate manual operational toil through intelligent automation and self-service tooling.

Release, Cloud & Platform Engineering

  • Support release management, ensuring changes meet reliability, security, and performance standards.
  • Apply DevOps and SRE practices, including automation-first deployments, rollback strategies, and resilience testing.
  • Leverage cloud-native and containerized platforms (Docker, Kubernetes, Azure) to support scalable, resilient workloads.
  • Participate in on-call rotations with a strong emphasis on automation and AI-driven reduction of recurring incidents.

What You'll Bring

Core Experience & Mindset Requirements

  • 10 years of experience across enterprise application engineering, SRE, and production operations, with an automation-first mindset.
  • Proven experience driving AI-based automation, AIOps, or intelligent operational tooling in complex enterprise environments.
  • Strong ownership mentality for system reliability, performance, and customer impact.

AI Automation & Engineering Skills

  • Hands-on experience with Generative AI, LLMs, or Agentic AI frameworks applied to automation, monitoring, or operations.
  • Proficiency in Python, Shell scripting, SQL, PL/SQL, and automation frameworks.
  • Experience building AI-enhanced runbooks, chatbots, or autonomous operational workflows is highly desirable.
  • Ability to translate operational patterns into repeatable, intelligent automation.

Technology Stack

  • Deep experience with Oracle EBS and/or Oracle Fusion Cloud (AR, AP, FA, PO, INV, OM, PA, Planning).
  • Strong knowledge of observability platforms: Dynatrace, AppDynamics, Splunk, ELK, Grafana.
  • Experience with integration technologies: Oracle SOA/OIC, MuleSoft, Kafka/JMS, EDI.
  • Familiarity with containers and cloud platforms (Docker, Kubernetes, Azure).

Professional Skills

  • Exceptional problem-solving, analytical, and systems-thinking abilities.
  • Strong communication skills, including the ability to explain complex AI-driven and technical concepts to both technical and non-technical stakeholders.
  • Experience leading incidents, facilitating postmortems, and driving cultural adoption of blameless SRE principles.

Education

  • Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field.

Supervisor: No

This position requires office presence of a minimum of 5 days per week and is only located in the location(s) posted. No relocation is offered.

Our Lead System Engineer earns between $158,200 and $237,400 USD annually, not to mention all the other amazing rewards that working at AT&T offers. Individual starting salary within this range may depend on geography, experience, expertise, and education/training.

Joining our team comes with amazing perks and benefits:

  • Medical/Dental/Vision coverage
  • 401(k) plan
  • Tuition reimbursement program
  • Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
  • Paid Parental Leave
  • Paid Caregiver Leave
  • Additional sick leave beyond what state and local law require may be available but is unprotected
  • Adoption Reimbursement
  • Disability Benefits (short term and long term)
  • Life and Accidental Death Insurance
  • Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
  • Employee Assistance Programs (EAP)
  • Extensive employee wellness programs
  • Employee discounts of up to 50% on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available), and AT&T phone.

#LI-Onsite Full-time office role

Ready to join our team? Apply today.

Weekly Hours:

40

Time Type:

Regular

Location:

USA:GA:Alpharetta / 500 North Point Pkwy - Adm (owned):500 North Point Pkwy
USA:TX:Plano / W Plano Pkwy - Adm:3400 W Plano Pkwy
USA:WA:Bothell / 20205 North Creek Pkwy - Adm (bothell 8):20205 North Creek Pkwy

Salary Range:

$141300.00 - $237400.00

It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state, or local law. In addition, AT&T will provide reasonable accommodations for qualified individuals with disabilities. AT&T is a fair chance employer and does not initiate a background check until an offer is made.


Required Experience:

IC

About Company

At AT&T, we know connections change lives – ready to change yours? Explore our career areas and search our open jobs in telecommunications here.
