Principal Site Reliability Engineer (TDP)

Palo Alto Networks

Job Location:

Santa Clara County, CA - USA

Monthly Salary: Not Disclosed
Posted on: 8 days ago
Vacancies: 1 Vacancy

Department:

Engineering

Job Summary

Your Career

Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer for the TDP team, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability.

Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, ESO, Kafka, Neo4j, Spanner, MongoDB, Cassandra, BigQuery, Redshift, MySQL, Python, Bash, and Go.

Your Impact

  • Contribute to the success of SRE and DevOps

  • Develop expertise in new technologies

  • Work with developers, researchers, data scientists, and security experts

  • Design, build, and operate reliable, secure cloud infrastructure

  • Ensure that applications are production-ready, scalable, and reliable

  • Develop tools and automation frameworks

  • Automate the deployment of robust services

  • Orchestrate end-to-end monitoring and alerting

  • Participate with SRE and Dev teams in the on-call rotation

  • Lead root cause analysis of critical business and production issues

  • Design, implement, and maintain the company's database systems to ensure optimal performance, availability, and stability.

  • Safeguard sensitive data by implementing and managing robust security measures.

  • Develop and manage reliable backup and recovery strategies to prevent data loss and ensure business continuity.

  • Collaborate with development and IT teams to support applications and infrastructure that rely on the databases.

  • Proactively monitor database performance to identify and resolve bottlenecks, slow queries, and resource contention issues.

  • Optimize complex SQL queries, stored procedures, and database configurations (tuning).

  • Manage and optimize database objects, including tables, indexes, and schemas, to improve efficiency and responsiveness.

  • Design, implement, and manage comprehensive backup and recovery procedures.

  • Perform regular testing of backups and restore procedures to ensure data can be recovered swiftly and accurately in a disaster scenario.

  • Develop and maintain disaster recovery plans and execute them during system outages.


Qualifications:

Your Experience

  • 6 years as an engineer in Infrastructure, Operations, SRE, DevOps, or System Engineering

  • 4 years building high-availability, scalable, cloud-native applications on AWS and GCP

  • BS or MS in Computer Science or a related field, or equivalent professional or military experience, required

  • Expert proficiency in SQL and GraphQL.

  • Deep working knowledge of at least one major relational or graph database platform (e.g., Neo4j, Spanner, MySQL, PostgreSQL, AlloyDB).

  • Experience with database design, data modeling, and data warehousing concepts.

  • Strong understanding of backup, recovery, performance monitoring, and tuning techniques.

  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm

  • Passion for infrastructure and monitoring as code

  • Solid experience in container workloads and Kubernetes

  • Familiarity with PKI and networking concepts

  • In-depth knowledge of different security controls (App-ID, User-ID, security profile, URL category, content, SSL decryption, firewall, MFA, etc.)

  • Linux administration, internals, and network troubleshooting

  • Proficiency with programming languages like Golang or Python along with shell scripting to automate tasks.

  • Proficiency with CI/CD pipelines, ArgoCD, and GitLab CI/CD.

  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions

  • Experience managing Kafka is a plus

  • Excellent written and verbal communication; able to collaborate and rally support

  • Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive.

  • Ready to understand and dissect new technology stacks quickly

  • Experience with Cloud Database Services (e.g., Amazon RDS, Azure SQL Database, Google Cloud SQL).

  • Relevant professional certifications (e.g., Oracle Certified Professional (OCP), Microsoft Certified: Azure Database Administrator Associate).

  • Experience with NoSQL databases (e.g., MongoDB, Cassandra).

  • Familiarity with data governance and regulatory compliance standards (e.g., GDPR, HIPAA)


Additional Information:

The Team

Our engineering team is at the core of our products, connected directly to the mission of preventing cyberattacks. We are constantly innovating, challenging the way we and the industry think about cybersecurity. Our engineers don't shy away from building products to solve problems no one has pursued before.

We define the industry instead of waiting for directions. We need individuals who feel comfortable in ambiguity, excited by the prospect of a challenge, and empowered by the unknown risks facing our everyday lives that are only enabled by a secure digital downtime.

Compensation Disclosure

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary plus commission target (for sales/commissioned roles) is expected to be between $147,000 and $230,000 per year. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

Our Commitment

We're problem solvers that take risks and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating together.

We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at  .

Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.

All your information will be kept confidential according to EEO guidelines.

Is role eligible for Immigration Sponsorship: Yes


Remote Work:

No


Employment Type:

Full-time

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Our enterprise security platform detects and prevents known and unknown threats while safely enabling an increasingly complex and rapidly growing number of applications. Come be part of the team that redefined the firewall industry and is now the fastest-growing security company in hi ...
