Sr. Software Engineer, Full Stack

F5 Networks

Job Location:

San Jose, CA - USA

Annual Salary: $176,800 - $265,200

Posted on: 30+ days ago

Vacancies: 1 Vacancy

This job posting is outdated and the position may be filled.

Job Summary

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.

Everything we do centers around people. That means we obsess over how to make the lives of our customers and their customers better. And it means we prioritize a diverse F5 community where each individual can thrive.

Position Summary: We're seeking a Senior Software Engineer specializing in Cloud SRE and Automation, with strong expertise in building reliable, scalable cloud infrastructure, implementing runbook automation, and driving operational excellence through intelligent automation and observability.

What You'll Do:

Build and Scale Cloud Infrastructure with Reliability

Design, develop, and implement cloud-native automation and remediation services across AWS, Azure, and GCP platforms
Build and maintain highly available, scalable infrastructure using Infrastructure as Code (Terraform, CloudFormation, ARM Templates)
Develop and optimize cloud architectures for reliability, performance, and cost efficiency
Implement and manage Kubernetes-based containerized workloads across multi-cloud environments
Design and build self-healing systems with automated remediation and closed-loop automation
Create and maintain observability pipelines, monitoring solutions, and alerting systems across cloud platforms
Develop cloud-native CI/CD pipelines for rapid, reliable application and infrastructure deployments

Drive SRE Excellence and Automation

Apply Site Reliability Engineering principles, including SLIs, SLOs, SLAs, and error budgets, to cloud services
Design and implement runbook automation frameworks for incident response and operational tasks
Build automation tools and scripts to reduce toil and improve operational efficiency
Develop integration layers with ITSM platforms, incident management systems, and monitoring tools (ServiceNow, PagerDuty, Jira)
Implement chaos engineering and resilience testing to validate system reliability
Perform capacity planning, performance tuning, and cost optimization for cloud resources
Participate in incident response and on-call rotations, and conduct blameless postmortems

Monitor, Observe, and Optimize

Implement comprehensive observability solutions using Prometheus, Grafana, OpenTelemetry, CloudWatch, and other tools
Build automated alerting and intelligent runbook triggering based on system metrics and logs
Develop dashboards and metrics to track system health, performance, and reliability
Analyze system behavior and implement predictive analytics for proactive issue detection
Optimize application and infrastructure performance across distributed cloud environments

Collaborate and Mentor

Work closely with SREs, QA, development teams, and platform engineers to improve reliability and performance
Mentor junior engineers on SRE best practices, cloud architecture, and automation development
Participate in code reviews, technical design discussions, and architecture planning
Contribute to the evolution of SRE culture and practices within the organization
Document automation workflows, runbooks, cloud architectures, and operational procedures

Qualifications:

Must-Have:

Software Engineering & SRE Experience: 6-8 years of software development experience, with 3 years in SRE, DevOps, cloud engineering, or platform engineering roles
Programming Skills: Strong programming proficiency in Python
Cloud Platform Expertise: Hands-on experience with multi-cloud environments:
- AWS: EC2, ECS/EKS, Lambda, CloudWatch, CloudFormation, Step Functions, Systems Manager, Auto Scaling, VPC, IAM, S3, RDS/DynamoDB
- Azure: Virtual Machines, AKS, Azure Functions, Azure Monitor, ARM Templates, Logic Apps, Azure Automation, Virtual Networks, Azure AD
- GCP: Compute Engine, GKE, Cloud Functions, Cloud Monitoring, Deployment Manager, Cloud Workflows, VPC, IAM
- Experience with at least two major cloud providers
Kubernetes & Containers: Strong experience with:
- Kubernetes architecture, deployments, services, and operations
- Container orchestration (EKS, AKS, GKE)
- Docker, containerd, and container image management
- Helm and Kustomize for application packaging
SRE Principles & Practices: Deep understanding of:
- Site Reliability Engineering methodologies (SLIs, SLOs, SLAs, error budgets)
- Incident management, on-call practices, and postmortem culture
- Toil reduction and automation strategies
- Capacity planning and performance engineering
- Chaos engineering and resilience testing
Observability & Monitoring: Hands-on experience with:
- Prometheus, Grafana, OpenTelemetry
- Cloud-native monitoring (CloudWatch, Azure Monitor, Cloud Logging)
- Log aggregation and analysis (ELK Stack, Splunk, Datadog)
- Distributed tracing and APM tools
- Time-series databases and metrics systems
CI/CD & DevOps: Strong experience with:
- CI/CD pipelines (Jenkins, GitLab CI/CD, GitHub Actions, ArgoCD)
- Git-based workflows and version control
- Automated testing and deployment strategies
Communication & Problem-Solving: Excellent troubleshooting, debugging, and analytical skills; strong written and verbal communication abilities

Nice-to-Have:

Advanced Cloud Skills: Experience with:
- Multi-cloud architecture and hybrid cloud deployments
- Cloud migration strategies and implementation
- Serverless architectures and event-driven systems
- Cloud cost optimization and FinOps practices
- Cloud security best practices and compliance frameworks
Advanced Kubernetes: Experience with:
- Kubernetes operators and custom controllers
- Custom Resource Definitions (CRDs)
- Kubernetes security, RBAC, and network policies
- Multi-cluster and multi-tenant Kubernetes architectures
AI/ML for Operations: Understanding or experience with:
- Machine learning concepts for IT operations
- Anomaly detection and predictive analytics
- Intelligent alerting and automated root cause analysis
- Log analysis using ML techniques
Advanced Observability: Experience with:
- eBPF for advanced system observability
- Custom OpenTelemetry collectors and instrumentation
- Advanced APM tools (New Relic, Dynatrace, AppDynamics)
- Network performance monitoring (ThousandEyes, Kentik)
Additional Automation Tools: Experience with:
- ChatOps frameworks and integrations
- Incident response automation platforms (Shoreline, Resolve)
- Event streaming platforms (Kafka, Kinesis, Pub/Sub)
Automation & Runbooks: Experience with:
- Runbook automation platforms (Rundeck, StackStorm, Ansible Tower/AWX)
- Workflow orchestration and job scheduling
- Automated remediation and self-healing systems
- Event-driven automation frameworks
Additional Experience:
- Contributions to open-source cloud, SRE, or automation projects
- Experience with database reliability (MySQL, PostgreSQL, NoSQL)
- Knowledge of disaster recovery and high availability patterns

Education:

Typically requires a minimum of 10 years of related experience with a bachelor's degree; or 3 years and a master's degree
Bachelor's degree in Computer Science, Information Technology, or related field preferred

Environment:

Freedom and Learning: Embrace an environment that fosters freedom, continuous learning, and ownership
Mentorship: Benefit from great mentors with solid backgrounds in various areas, eager to contribute to your professional development
Team Collaboration: Join a great team where you will feel at home from day one, contributing to a positive and supportive workplace culture

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

The annual base pay for this position is: $176,800.00 - $265,200.00

F5 maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, geographic locations, and market conditions, as well as to reflect F5's differing products, industries, and lines of business. The pay range referenced is as of the time of the job posting and is subject to change.

You may also be offered incentive compensation, bonus, restricted stock units, and benefits. More details about F5's benefits can be found at the following link: F5 reserves the right to change or terminate any benefit plan without notice.

Please note that F5 only contacts candidates through F5 email address (ending with @) or auto email notification from Workday (ending with or @).

Equal Employment Opportunity

It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws. This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination. F5 offers a variety of reasonable accommodations for candidates. Requesting an accommodation is completely voluntary. F5 will assess the need for accommodations in the application process separately from those that may be needed to perform the job. Request by contacting .

Required Experience:

Senior IC