We are seeking a highly skilled and experienced Senior DevOps Engineer with a primary focus on Monitoring and Observability to drive the continuous improvement of our security-focused SaaS platform. In this role, you will work alongside engineering, security, and operations teams to ensure that our systems are secure, scalable, and always up and running. Your efforts will directly impact the performance, uptime, and security of our offerings for clients around the world.
Key Responsibilities:
Monitoring & Observability: Design, implement, and maintain sophisticated monitoring, alerting, and logging solutions to ensure the reliability, availability, and performance of our security-focused SaaS platform. Use tools like Prometheus, Grafana, and Datadog to provide deep visibility into system health, security metrics, and application performance.
Incident Management: Respond to and mitigate incidents in real time, ensuring minimal impact on customers. Drive post-mortems and root cause analyses (RCAs) to improve monitoring and response processes.
System Reliability: Collaborate with cross-functional teams to define and implement Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for both security and performance metrics.
Automation & CI/CD Integration: Build automated monitoring and alerting pipelines that integrate seamlessly with CI/CD workflows to catch issues early in development, testing, and production environments.
Mentorship & Best Practices: Provide guidance and mentorship to junior DevOps engineers, helping them adopt best practices for monitoring, observability, and security.
Optimization & Continuous Improvement: Continuously evaluate and refine monitoring tools and practices to adapt to new threats, technologies, and regulatory requirements.
Required Qualifications:
5 years of experience in DevOps, Site Reliability Engineering, or Infrastructure roles, ideally in cybersecurity or SaaS environments.
Strong experience with monitoring tools like Prometheus, Grafana, Datadog, ELK, Splunk, or similar observability solutions.
Expertise in Linux/Unix-based systems and cloud environments (AWS, GCP, Azure).
Proficiency in scripting languages such as Python, Bash, or Go to automate monitoring tasks and create custom solutions.
Deep understanding of security principles and experience integrating security monitoring into DevOps practices (e.g., SIEM systems, threat detection).
Experience with containerization (Docker) and orchestration (Kubernetes) to monitor containerized applications in production.
Familiarity with Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation to automate infrastructure monitoring setup.
Solid problem-solving skills, a keen eye for detail, and a proactive approach to system monitoring and incident response.
Preferred Qualifications:
Experience in cybersecurity or working on security monitoring solutions.
Experience with performance monitoring and APM (Application Performance Monitoring) tools.
Background in a software engineering discipline or security engineering.
Certifications in relevant fields such as AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), Certified DevOps Engineer (or similar).
Required Experience:
Senior IC
Full-Time