Senior SRE, Infrastructure & Platform

F5 Networks

Job Location: Singapore - Singapore

Monthly Salary: Not Disclosed
Posted on: 10 hours ago
Vacancies: 1 Vacancy

Job Summary

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.

Everything we do centers around people. That means we obsess over how to make the lives of our customers and their customers better. And it means we prioritize a diverse F5 community where each individual can thrive.

F5 is bringing a better digital world to life by helping organizations create, secure, and run applications that power our lives. Within the Platform Engineering team, this role helps ensure our platform is operated safely, reliably, and with operational excellence.

We are looking for a Senior Site Reliability Engineer who leads with kindness and possesses a strong software development background to join our Infrastructure Engineering team. Your primary focus will be building automation, tooling, and internal platforms that enable our team to operate a global multi-datacenter infrastructure spanning a growing number of Points of Presence across the globe.

Deep familiarity with production infrastructure -- bare-metal hypervisors, containerized workloads, Kubernetes clusters, and cloud platforms -- is essential, but your primary lens should be on automation. You will develop internal tools and APIs in Python and Go, design and maintain Ansible automation across hundreds of hosts, build CI/CD pipelines, and create self-service interfaces that reduce toil and eliminate manual operations.

You will work within a PCI-DSS compliant environment and participate in a 24x7 on-call rotation.

What You'll Do

Internal Tooling & Application Development

  • Design and develop internal tools, CLIs, and APIs (primarily in Go and Python) that enable infrastructure self-service, automate complex workflows, and improve operational efficiency
  • Build integrations between infrastructure systems -- connecting CMDB/IPAM (NetBox), secrets management (HashiCorp Vault), hypervisor APIs (Proxmox), monitoring platforms, and CI/CD pipelines into cohesive automated workflows
  • Develop and maintain API clients and libraries for interacting with infrastructure services (Proxmox API, Vault API, NetBox API, iLO Redfish, container registries)
  • Write well-tested, documented, and maintainable code with proper versioning, release processes, and code review practices

Infrastructure as Code & Ansible Development

  • Architect, develop, and refactor Ansible roles and playbooks across a large-scale inventory spanning 30 datacenters, 80 group variable files, and 40 roles
  • Design reusable, composable Ansible role patterns that scale cleanly as the DC footprint grows -- new DCs should be deployable with minimal variable additions
  • Improve idempotency, error handling, and test coverage across the existing Ansible codebase
  • Develop custom Ansible modules, plugins, and lookup plugins where upstream modules may be insufficient (e.g. custom Vault integration, Proxmox API interactions, iLO automation)
  • Automate bare-metal server lifecycle end-to-end: from iLO bootstrap through OS installation, hypervisor configuration, VM provisioning, and service deployment

CI/CD Pipeline Engineering

  • Design, write, and maintain GitLab CI pipelines for infrastructure automation, including multi-stage deployment workflows with linting, validation, canary testing, and regional rollout
  • Build pipeline patterns for safe infrastructure changes: staged rollouts, automated rollback, drift detection, and change validation
  • Create reusable pipeline templates and shared CI components that standardise how infrastructure changes are tested and deployed
  • Implement automated testing for Ansible roles and infrastructure changes (molecule, ansible-lint, integration testing in ephemeral environments)

Kubernetes & Container Platform Automation

  • Develop automation for self-hosted Kubernetes cluster lifecycle management: provisioning, upgrades, scaling, and disaster recovery
  • Build and maintain container image build pipelines, registry management, and image promotion workflows
  • Create Kubernetes operators or controllers (in Go) where custom automation of cluster-level concerns is needed
  • Automate workload deployment patterns, including Helm chart development and GitOps workflows

Cloud Infrastructure Automation

  • Develop IaC and automation for AWS and Azure resources, integrating cloud infrastructure with on-premises systems
  • Build automation that spans hybrid environments -- coordinating deployments across bare-metal, virtualized, and cloud targets from a unified pipeline

Observability & Reliability Engineering

  • Instrument internal tools and automation with proper logging, metrics, and tracing
  • Build automated remediation workflows that respond to monitoring alerts and reduce mean time to recovery
  • Develop reporting and dashboards that provide visibility into infrastructure state, automation success rates, and toil metrics
  • Identify and automate away recurring operational toil; track and quantify toil reduction over time

Security & Compliance Automation

  • Automate PCI-DSS compliance workflows, including CIS benchmark hardening, audit evidence collection, and configuration drift detection
  • Build automated secret rotation pipelines using HashiCorp Vault
  • Develop security scanning integration into CI/CD pipelines (container image scanning, infrastructure configuration validation)

What We're Looking For

  • 5 years of experience in an SRE, DevOps, or Infrastructure Engineering role, with a strong emphasis on writing code and building automation
  • Proficiency in Python, with experience building CLI tools, APIs (Flask/FastAPI or equivalent), and automation frameworks
  • Expert-level Ansible skills: custom role development, module/plugin authorship, complex Jinja2 templating, inventory management at scale, and CI/CD integration
  • Solid Linux systems knowledge (RHEL/CentOS) -- you need to understand the systems you're automating at a depth that lets you debug failures and design robust automation
  • Experience building and maintaining CI/CD pipelines (GitLab CI preferred) for infrastructure automation, not just application builds
  • Production experience with self-hosted Kubernetes: cluster operations, controller/operator development, and workload automation
  • Practical AWS and Azure experience with an IaC mindset -- provisioning and managing cloud resources through automation, not console clicks
  • Experience with API-driven infrastructure management (RESTful APIs, Redfish/iLO, hypervisor APIs)
  • Familiarity with HashiCorp Vault or equivalent secrets management platforms, including programmatic integration
  • Understanding of PCI-DSS requirements as they apply to automated infrastructure management -- audit trails, change control, hardening automation
  • Strong software engineering fundamentals: version control workflows, code review, testing practices, documentation, and release management

Preferred

  • Experience with Proxmox VE API automation or similar hypervisor platform APIs (VMware vSphere, libvirt)
  • Familiarity with bare-metal server management automation (HPE iLO Redfish API, IPMI, or equivalent)
  • Working proficiency in Go, with experience building at least one of: CLI tools, APIs, Kubernetes operators/controllers, or systems-level tooling
  • Experience building custom Ansible modules or plugins in Python
  • Familiarity with NetBox (or similar CMDB/IPAM) API integration for inventory-driven automation
  • Experience developing Kubernetes operators using operator-sdk, kubebuilder, or controller-runtime
  • Background in network automation (DNS management, load balancer configuration, LDAP/directory services)
  • Experience operating in colocation / carrier-neutral DC environments (Equinix, Interxion, or similar)
  • Contributions to open-source infrastructure tooling or libraries

What You'll Need to Succeed

  • A software engineering mindset applied to infrastructure problems -- you think in terms of abstractions, interfaces, testability, and maintainability, not just getting it working
  • Strong opinions on code quality, but pragmatism about when to ship -- this is infrastructure tooling, not a SaaS product, and the right trade-offs are different
  • The ability to understand complex existing systems deeply enough to automate them safely -- our Ansible codebase has evolved over years, and new automation must integrate cleanly
  • Comfort working autonomously in a globally distributed, remote-first team across multiple time zones
  • Clear written communication -- you will write design documents, READMEs, and runbooks as a natural part of your development workflow
  • Willingness to participate in a 24x7 on-call rotation; your on-call experience will directly inform what you build next

Nice to Have

  • Experience with OSTree / image-based OS lifecycle automation
  • Familiarity with Pulp or on-premises package repository management automation
  • Experience building developer portals or self-service infrastructure platforms (Backstage or similar)
  • Background in DDoS mitigation automation or network-function virtualization

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

Please note that F5 only contacts candidates through F5 email address (ending with @) or auto email notification from Workday (ending with or @).

Equal Employment Opportunity

It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws. This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination. F5 offers a variety of reasonable accommodations for candidates. Requesting an accommodation is completely voluntary. F5 will assess the need for accommodations in the application process separately from those that may be needed to perform the job. Request by contacting .


Required Experience:

Senior IC

About Company

F5 application services ensure that applications are always secure and perform the way they should—in any environment and on any device.