Director, Data Center Reliability Engineering

Oracle


Job Location:

Nashville, TN - USA

Yearly Salary: $146,300 - $306,400
Posted on: 2 days ago
Vacancies: 1 Vacancy

Job Summary

Description

Key Responsibilities

  • Lead reliability engineering and analytics teams across multiple sites.
  • Standardize and enforce FMEA, RCA, and continuous improvement methodologies.
  • Oversee deployment of monitoring, analytics, and automation tools supporting reliability programs.
  • Define, track, and report reliability KPIs to executive and global operations leadership.
  • Ensure corrective actions are implemented, verified, and sustained.
  • Develop engineers and analysts in disciplined, data-driven problem solving.


Ideal Candidate Profile

  • Senior experience in reliability engineering, maintenance engineering, or uptime-critical environments.
  • Strong background in analytics, RCA rigor, and reliability frameworks.


Skills and Competencies

  • Strong technical leadership and stakeholder influence.
  • Comfortable translating analysis into executive-level decisions.


Why Oracle Cloud Infrastructure

  • Global impact at scale: Contribute directly to how mission-critical OCI data centers operate across regions and continents, influencing infrastructure reliability, security, sustainability, and long-term capacity growth.
  • Technically rigorous environment: Work alongside experienced engineers, automation specialists, and compliance teams in a rapidly scaling hyperscale cloud infrastructure, where disciplined execution and technical depth matter.
  • Culture built on operational excellence: Join an organization that values safety, process rigor, clear accountability, and continuous improvement as foundational to protecting uptime and customer trust.
  • Long-term career development: Benefit from internal mobility, role-based technical training, and development opportunities designed for professionals building long-term careers in cloud infrastructure and facilities operations.


Responsibilities

Key Responsibilities
Data Center Site Portfolio Management:
-Serves as the Data Center country leader, typically with responsibility for one or more sites and teams in a region.

Performance Monitoring and Analysis:
-Sets strategic direction for data center operations performance monitoring in collaboration with executive leadership.
-Defines strategic direction for network performance evaluation in collaboration with executive leadership.
-Establishes strategic direction for analysis of physical power and cooling capacity in collaboration with executive leadership.
-Defines the strategic direction for continuous improvement and collaborates with executive leadership to achieve KPIs and objectives.

Issue Management and Automation:
-Oversees all aspects of support for escalated complex technical issues across multiple data centers.
-Defines and enforces strategies for issue triage, leveraging advanced automation, scheduling, and monitoring tools.
-Identifies, documents, and standardizes issues, processes, and solutions, ensuring the data center knowledge base is comprehensive, accurate, and strategically aligned with department goals.
-Oversees the implementation of strategy for incident or crisis management protocols in alignment with business continuity plans.
-Establishes best practices for conducting Root Cause Analysis (RCA) following crises or incidents and updates documentation to capture process improvements.

Data Center Expansion Support:
-Sets the strategic direction and oversees the entire process of new region builds and expansion activities, both onsite and remotely.
-Acts as the primary liaison with senior project teams and data center engineering leadership, organizing resources and ensuring strategic timelines and long-term capacity needs are effectively managed for all expansion projects and site builds.
-Collaborates at the highest level with project teams to ensure the delivery of world-class standards across all expansion projects and site builds.

Installation and Maintenance:
-Directs all aspects of installations, repairs, inventory management, and logistics tasks across several data centers.
-Establishes standards and best practices for component replacements and upgrades.
-Advises on and manages large-scale purchases or upgrades for data centers.
-Ensures implementation of proactive maintenance and lifecycle management strategies for the Data Center facilities with regard to efficiency and stability (e.g., containment, air flow and pressure, power trains).

Core Responsibilities
Planning & Execution:
-Oversees and guides multiple teams on managing complex projects or initiatives, monitoring timelines, deliverables, and budgets when applicable to ensure strategic objectives are met. Serves as a role model for appropriately delegating work, setting priorities, and ensuring alignment with business needs. Coaches others on adjusting resources or project timelines in anticipation of business changes.
Collaboration & Partnership:
-Role models leading cross-functional collaborative efforts to ensure alignment of expectations and strategic objectives. Empowers team to build and maintain partnerships with business leaders, stakeholders, and/or customers to address barriers and contribute to organizational success. Drives transparency and inclusivity by modeling actively seeking, listening to, and leveraging diverse perspectives.
Problem Solving:
-Shares problem-solving strategies across teams, providing oversight on complex operational and/or technical issues as needed. Coaches teams on analyzing highly complex data and/or information to identify solutions to ambiguous issues and provides direction on identifying root causes to prevent recurrence of issues.
Continuous Learning:
-Pursues strategic learning opportunities to maintain expertise and apply best practices at the organizational level. Creates opportunities for team members and leaders to build their expertise in new areas, coaching them to build innovative skills. Identifies skill gap trends across the organization and upholds a culture that places significant emphasis on sharing knowledge and pursuing learning opportunities that advance the organization. Evaluates efficiency of learning strategies and recommends adjustments as needed.
Continuous Improvement:
-Empowers team to own the development and implementation of ideas that increase the efficiency and effectiveness of processes, protocols, and workflows across the department. Coaches teams to gain buy-in for ideas and to seek feedback on approaches and methods for continued improvement. Prioritizes and reviews the roadmap of improvement initiatives to ensure alignment with strategic direction and maximize return on investments.
Performance and Development:
-Serves as a role model for driving performance across teams through tailored feedback and coaching in alignment with performance management processes, guidelines, and expectations. Drives consistency in the application of talent development procedures and socializes performance expectations across the organization. Ensures that individual development goals are aligned with organizational strategic initiatives. Collaborates with HR to implement talent strategy through hiring and promotion processes.



Qualifications
Disclaimer:

Certain U.S. based or U.S. customer or client-facing roles may be required to comply with applicable requirements such as immunization/occupational health mandates and/or drug testing requirements.

Range and benefit information provided in this posting are specific to the stated locations only.

US: Hiring Range in USD from: $146,300 to $306,400 per annum. May be eligible for bonus, equity, and compensation deferral.


Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions, and locations, as well as reflect Oracle's differing products, industries, and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short-term disability and long-term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits, including auto, homeowner, and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - M4





Required Experience:

Director

About Company

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...