Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer (Infra) on our Infrastructure team at the TLV office, you will play a key role in ensuring the reliability, scalability, and performance of our critical systems. You will be responsible for managing and improving our core infrastructure, with a focus on automation, monitoring, and incident response. You will work with a wide range of technologies, including Kubernetes, monitoring and observability tools, configuration management systems, and core networking services.
To thrive in this role, you'll need:
- 5 years of experience in a Site Reliability Engineering, Systems Engineering, or similar role.
- Deep understanding of Site Reliability Engineering principles and practices.
- Extensive experience with Kubernetes, including deployment, management, and troubleshooting.
- Strong experience with monitoring and observability tools such as SensuGo, Zabbix, VictoriaMetrics, Prometheus, and ELK.
- Proficiency in configuration management tools such as Puppet and Ansible.
- Solid understanding of Linux internals and networking.
- Experience with managing and maintaining core services such as DNS and networking.
- Strong programming skills in Python and/or Go.
- Experience with both on-premises and cloud environments.
- Experience with KubeVirt.
- Excellent troubleshooting and problem-solving skills.
- Strong communication and collaboration skills.
- Ability to work in a fast-paced, dynamic environment.
- Ability to participate in on-call rotations, including weekends.
Preferred Qualifications:
- Experience with large-scale distributed systems.
- Experience with other cloud providers (e.g., AWS, Azure, GCP).
- Contributions to open-source projects.
How you'll make an impact:
As a Site Reliability Engineer, you will:
- Ensure the reliability, availability, and performance of our infrastructure services.
- Manage and maintain our Kubernetes infrastructure, including KubeVirt.
- Design, implement, and maintain our monitoring and observability stack (SensuGo, VictoriaMetrics, Prometheus, ELK).
- Automate infrastructure provisioning, configuration, and deployment processes using Puppet and Ansible.
- Manage and maintain core services such as DNS and networking.
- Troubleshoot and resolve complex infrastructure issues in a timely and efficient manner.
- Participate in on-call rotations and incident response.
- Develop and maintain infrastructure-as-code (IaC).
- Identify and implement proactive measures to prevent incidents and improve system reliability.
- Collaborate with development teams to ensure smooth and reliable deployments.
- Contribute to the design and implementation of new infrastructure solutions.
- Drive improvements in system architecture, processes, and tools.
- Mentor and coach other team members.
Why Taboola?
If you ask Taboolars what they love about working here, they'll tell you that they've been empowered to realize their full potential while growing and learning from and with smart and talented people. They'll also share more about:
- Adam Singolda, Taboola Founder and CEO, says: "You can copy anything from another business, but you can't copy a company's culture."
- Wellbeing: Enjoy comprehensive benefits (health, 401k, etc.), a fully stocked kitchen, and location-specific perks (gym partnerships, parking).
- Flexibility: We offer a hybrid work schedule with 3 days in-office, with an option to come in more often if desired.
- Work with some of the biggest names: Our publisher partners include Yahoo, Conde Nast, Fox Sports, NBCU, ESPN, CBS, and E! Online. Our advertiser clients include Wells Fargo, Honda, Pinterest, and Expedia.
Ready to realize your potential?