Infrastructure Reliability Engineer (Operational Excellence)

STACK Infrastructure

Posted on : 22-06-2025

1 Vacancy

Job Location

Elk Grove - USA

Salary

$170K - $190K

Job Description

Infrastructure Reliability Engineer

THE COMPANY:

STACK INFRASTRUCTURE (STACK) provides digital infrastructure to scale the world's most innovative companies. We are an award-winning industry leader in building, owning, and operating highly efficient, cost-effective wholesale colocation and cloud data centers. Each of our national facilities meets or exceeds the highest industry standards in all operational categories of availability, security, connectivity, and physical resilience.

STACK offers the scale and geographic reach that rapidly growing hyperscale and enterprise companies need. The world runs on data. Data runs on STACK.

THE POSITION:

STACK is looking for an Infrastructure Reliability Engineer who will act as a key member of STACK's Critical Operations team. This position will play a vital role in ensuring the ongoing performance, resiliency, and evolution of infrastructure systems across STACK's portfolio. This role requires deep technical fluency in data center power and cooling systems, a forensic mindset for failure analysis, and a proactive approach to risk reduction.

RESPONSIBILITIES:

Lead deep-dive root cause analyses (RCAs) for critical incidents, connecting technical failures to design, process, and operational contributors.
Inform and influence the design review and turnover process by identifying gaps in infrastructure handoffs, system limitations, or commissioning practices.
Develop system-level failure mode mitigation strategies that improve uptime performance and reduce repeat incidents.
Partner with Operations, Engineering, and Construction to identify design improvements needed to enhance operational reliability.
Engage Original Equipment Manufacturers (OEMs) and vendors to challenge technical assumptions and advocate for long-term improvements.
Support the evolution of maintenance standards and asset strategy for high-risk or complex systems (e.g., power distribution, cooling).
Collaborate with Learning and Development to enhance technical training for site teams based on lessons from event investigations.
Contribute to availability reporting, event response improvement, and risk trend monitoring to ensure service level agreement (SLA) commitments are met.

THE DETAILS:

Location: Chicago (CHI) or Dallas-Fort Worth (DFW)
Compensation: $170K - $190K plus 10% bonus potential
Travel: 25% domestically
Must be eligible to work in the United States
Must pass a comprehensive background screening

MUST-HAVE QUALIFICATIONS:

Bachelor's degree in engineering or equivalent experience with high technical competency.
5-8 years of experience in critical infrastructure environments (e.g., data centers, substations, power generation, or utility systems).
Strong technical fluency in electrical and/or mechanical systems: power distribution, uninterruptible power supply (UPS), generators, control systems, and heating, ventilation, and air conditioning (HVAC).
Hands-on experience with root cause analysis and reliability methodologies (e.g., failure mode and effects analysis (FMEA), reliability-centered maintenance (RCM)).
Demonstrated ability to work across disciplines to resolve complex technical issues.
Expertise with commissioning (Cx) and infrastructure design review processes.
Ability to analyze performance data and translate findings into practical improvements.

THIS MIGHT BE RIGHT FOR YOU IF:

You're the person people call when something goes wrong, and you love figuring out why.
You bring rigor and precision to every failure analysis and don't settle for surface-level fixes.
You want to engineer reliability, not just react to issues.
You enjoy working cross-functionally and are a collaborator who builds trust and consensus.
You're driven by impact, not ego, and you measure success by improved system resilience.
You thrive in the space between design intent and operational reality.

PREFERRED QUALIFICATIONS:

Experience reviewing or developing engineering specifications.
Background in vendor/OEM engagement and technical contract negotiation.
Familiarity with computerized maintenance management system (CMMS), data center infrastructure management (DCIM), or reliability-centered asset programs.
Understanding of availability metrics and SLA management frameworks.
Technical training or mentoring experience in field operations environments.

WHY STACK

We offer a competitive compensation package with strong benefits, including medical, dental, and vision insurance, a 401(k) program, flexible spending accounts, and even a cell phone subsidy.
We foster a culture of appreciation, including peer-to-peer recognition and rewards programs.
Fun is part of our DNA, with events, game nights, happy hours, and barbecues.
We're growing; this is a great time to join and make an impact!

STACK is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity and expression, age, national origin, mental or physical disability, genetic information, veteran status, or any other status protected by federal, state, or local law.

Note to external agencies: we are not accepting any blind submissions or resumes/CVs from recruitment agencies. Any candidates sent to STACK Infrastructure will not be accepted or considered as a submission without a signed agreement in place.

Employment Type

Full Time