Job Title: Systems Reliability Team Manager
Duration: 9 Months (Potential Extension)
Location: Oakland, California; Hybrid Preferred (Open to Remote)
We are seeking an experienced Systems Reliability Team Manager to lead a team responsible for ensuring the stability, performance, and continuous improvement of a large-scale digital platform.
This leadership role combines product management responsibilities with Software Development Life Cycle (SDLC) oversight, translating business needs into clear requirements for technology teams and ensuring the platform remains reliable and scalable. The manager will oversee system reliability operations, incident management, platform roadmap planning, and vendor coordination while supporting the ongoing evolution of digital products and services.
The role includes leading a team focused on incident troubleshooting, system performance monitoring, and defect resolution while maintaining alignment with governance, compliance, and organizational strategy.
Lead and manage the Systems Reliability team, ensuring platform performance, stability, and availability.
Translate business objectives into technical requirements and priorities for development and technology teams.
Oversee production and data incident management, including troubleshooting and root cause analysis.
Establish and maintain platform roadmaps and strategic technology initiatives.
Coordinate cross-functional collaboration between technology teams, product teams, and business stakeholders.
Monitor and improve system performance, reliability, and operational efficiency.
Provide leadership in portfolio planning, platform evolution, and digital product strategy.
Manage vendor strategy and oversight to ensure service and contract performance.
Direct development of a system reliability and platform roadmap aligned with enterprise strategy and funding priorities.
Ensure compliance with change management, release management, and incident management policies.
Oversee system availability, performance monitoring, and key performance indicators (KPIs).
Manage vendor strategy and contract performance for core technology platforms.
Guide product strategy initiatives that support continuous improvement of systems and processes.
Bachelor's degree or equivalent experience.
Strong portfolio management and governance experience across enterprise systems.
Demonstrated ability to translate business strategy into technical standards, processes, and operational accountability.
Experience managing cross-functional technical teams and system reliability initiatives.
Strong understanding of core platform and system architecture concepts, including:
APIs
Messaging systems
Data storage platforms
System environments and release management
Experience overseeing incident management processes and post-incident review practices.
Knowledge of observability tools, including logs, metrics, and traces, along with alert configuration.
Experience with change management and configuration management processes aligned with governance and audit requirements.
Familiarity with CI/CD pipelines and deployment automation practices.
Understanding of security fundamentals and vulnerability remediation practices.
Ability to maintain technical documentation standards, including runbooks, system diagrams, and knowledge base documentation.
Strong cross-team communication and collaboration skills to coordinate work across design, engineering, and quality teams.
Experience working with large-scale enterprise platforms or complex digital ecosystems.
Background in financial services, pension administration, or retirement systems is beneficial but not required.
For more details reach at
Required Experience:
Manager