Job Description
About Oracle Cloud:
Oracle Cloud is a comprehensive suite of cloud services, including infrastructure, platform, and applications, designed to help organizations build, deploy, and manage workloads securely at scale. At Oracle, we are building the most intelligent future of cloud computing. Our team is composed of talented, motivated, and diverse individuals committed to empowering our customers to accomplish their most important missions using Oracle Cloud Fusion Applications. We center our work around our customers' needs, striving to continuously enhance our cloud capabilities based on their challenges.
About the Team:
Join the Fusion Site Reliability Engineering Middleware (FSRE-MW) team, a critical group dedicated to maintaining the high availability of Oracle's Cloud Fusion Applications. We minimize the frequency and duration of customer-impacting events through large-scale incident management and automation. As a team, we combine the agility of a start-up with the scale and customer focus of a leading enterprise software company.
As a Principal Site Reliability Engineer, you will be a key member of a high-impact team focused on the availability, performance, and operational excellence of Fusion SRE Middleware. You will take ownership of production environments, including systems and the Fusion Middleware stack, and support mission-critical business operations for Cloud Fusion Applications. This role will emphasize automation and optimization of operations across multiple production environments, recommending AI-driven solutions to enhance availability, performance, and supportability. You will harness AI-based tools and predictive analytics to proactively identify issues, automate incident responses, and continuously improve system resilience. Additionally, you will provide escalation support for complex production problems, guide junior engineers, participate in major incident bridges, and help build and refine processes and procedures, using AI-powered insights to drive smarter, data-driven decisions.
Our team is front and center in reducing event duration, leveraging operational experience, best practices, and tool development to automate incident management and drive continual improvement.
About the Role:
We seek a Principal SRE to join our globally distributed team responsible for detecting, triaging, and mitigating service-impacting events rapidly and effectively through automation and AI-powered insights. You will be part of a regional team minimizing Fusion services downtime through exceptional incident management and system operations, with a strong emphasis on scalability, performance, security, and AI-driven solutions. In this dynamic role, you will gain deep insight into the inner workings of Oracle Cloud Fusion Apps, using AI tools to predict, identify, and address potential issues before they impact services. You'll influence cross-functional leaders and drive programs that boost service availability while leveraging AI to enhance real-time decision-making and improve operational efficiency.
If you're passionate about leveraging AI to break new ground as part of an agile team, we want to speak with you!
Our Values:
Oracle's values (equity, inclusion, respect, and commitment to the greater good) are foundational to our success. We foster opportunities for learning and growth and challenge one another to build the future together. As part of our team, you'll join a group of hardworking and diverse professionals given the autonomy and support to do your best work in a flexible and dynamic environment.
Career Level: IC4
Employer Description:
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. For over 40 years, we've thrived by operating with integrity and partnering with industry leaders across nearly every sector.
Innovation begins with empowering everyone to contribute, which is why we are committed to building an inclusive, equitable workforce. Oracle careers offer global opportunities and a healthy work-life balance, with competitive benefits including medical, life, and retirement options. We support our communities through volunteer programs and encourage a spirit of giving back.
Oracle is committed to including people with disabilities at all stages of the employment process. If you need assistance or accommodation for a disability, email or call 1 in the United States.
We are an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veteran status, or any other characteristic protected by law. Oracle will also consider qualified applicants with arrest and conviction records as permitted by applicable law.
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Hold authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate the technical characteristics of services and technology areas, and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.
Key Responsibilities:
Professional Skills Requirements:
Required Qualifications:
Experience with AI-driven Monitoring and Predictive Analytics
If you're ready to shape the future of cloud services at Oracle, we want to connect!
Apply today to join our innovative team.
Required Experience:
IC