Site Reliability Engineer

F5 Networks

Job Location:

Dublin - Ireland

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.

Everything we do centers around people. That means we obsess over how to make the lives of our customers and their customers better. And it means we prioritize a diverse F5 community where each individual can thrive.

The Role
This role serves as a critical hybrid position combining the responsibilities of a skilled Technical Support Engineer with Site Reliability Engineering (SRE) principles. The ideal candidate will embrace automation, observability, and operational excellence to ensure the reliability, scalability, and performance of our AI-powered public SaaS platform. You will operate at the intersection of system optimization and customer success, applying cloud-native technologies and distributed systems methodologies to address challenges at scale.

The role provides an opportunity to run, support, and scale an AI Security SaaS platform engineered for running AI inference across distributed architectures. Success in this role entails strong collaboration, a passion for automation, and the ability to proactively improve system reliability while assisting customers with complex technical inquiries.

Key Responsibilities

Proactive Monitoring, Performance Optimization, and Incident Management

Monitor and measure system behaviors: Ensure Service Level Objectives (SLOs) are being met through observability tools like metrics collection, logging systems, and distributed tracing. Apply proactive data insights to ensure optimal system performance and uptime.

24/7 Support Model: Drive operational excellence to maintain the availability and reliability of SaaS platforms through incident management, root cause resolutions, postmortem authorship, and service restoration processes.

Customer-Centric Incident Resolution

Act as the primary point of contact for high-priority technical inquiries and escalations.

Troubleshoot and resolve complex customer-facing issues, applying technical acumen to dissect log files, application traces, and system metrics quickly.

Identify, triage, and address technical problems, ensuring prompt communication and solution delivery.

Automation and Toil Reduction

Build and improve automated workflows through Infrastructure as Code (IaC) frameworks and scripting (e.g. Terraform, Python).

Advocate for and lead automation initiatives across monitoring, deployment processes, configuration management, and repetitive manual tasks, ensuring greater efficiency and reliability.

Collaboration with Development Teams and SRE Evolution

Collaborate with cross-functional engineering teams, sharing insights from monitoring systems, metrics, and customer interactions to help improve system design, architecture, and reliability.

Evangelize and introduce SRE principles, methodologies, and best practices (e.g. High Availability frameworks, service mesh, container orchestration).

Contribute directly to improving logging, reporting, and alerting capabilities within the SaaS platform.

Operational Security & Continuous Improvement

Ensure security awareness across operational tasks by integrating security-as-code and configuration management principles into workflows.

Drive continuous service improvement by analyzing patterns of incidents and service disruptions and planning both immediate and long-term fixes.

Qualifications

Bachelor's degree in Computer Science, Information Technology, or a related field, or demonstrable equivalent experience.

Technical Experience:

1-3 years in a technical support, system administration, or cloud operations role.

Foundational knowledge of public cloud environments (e.g. AWS, Google Cloud, OpenStack).

Proficiency in scripting languages such as Python and Bash, and familiarity with Infrastructure as Code tools (e.g. Terraform).

Solid understanding of web technologies, protocols, and APIs (e.g. HTTP, REST, JSON).

Basic familiarity with networking, databases (e.g. PostgreSQL), and Linux server administration.

Soft Skills:

Strong analytical and problem-solving skills, with comfort in troubleshooting under pressure.

Excellent communication skills, with the ability to collaborate effectively across teams and engage with customers professionally and empathetically.

Preferred

Direct experience with Kubernetes or container orchestration systems.

Knowledge of monitoring tools such as Prometheus, Grafana, or equivalent observability stacks.

Previous exposure to Site Reliability Engineering principles such as SLO management, automated delivery pipelines, or fault-tolerant architectures.

Familiarity with configuration management tools such as Ansible, Chef, or Puppet.

What Success Looks Like

The successful candidate will embody fundamental SRE principles and demonstrate:

System Reliability Expertise: Navigate, maintain, and scale SaaS applications by identifying optimal fixes to complex issues in production systems.

Observability Innovator: Drive advancements in monitoring, metric collection, logging, and tracing tools to achieve deeper insights into system behaviors and improve reliability.

Future-Focused Problem Solver: Strategically balance immediate tactical solutions with long-term architectural improvements, reducing operational toil and increasing scalability.

Collaborative Leadership: Demonstrate a collaborative mindset, foster innovation, and mentor peers to champion automation initiatives and SRE best practices.

The position will also require participation in an on-call rotation for out-of-hours incident response.

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

Please note that F5 only contacts candidates through an F5 email address (ending with @) or an automated email notification from Workday (ending with or @).

Equal Employment Opportunity

It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws. This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination. F5 offers a variety of reasonable accommodations for candidates. Requesting an accommodation is completely voluntary. F5 will assess the need for accommodations in the application process separately from those that may be needed to perform the job. Request by contacting .
