Founded in 2017, Obsidian Security was created to close a critical gap: securing the SaaS applications where modern business happens (platforms like Microsoft 365, Salesforce, and hundreds more).
Backed by top investors including Greylock, Norwest Venture Partners, and IVP, we've built a complete SaaS security platform to reduce risk, detect and respond to threats, and prevent breaches at the source. Our team includes leaders who helped define the categories of endpoint and identity security at CrowdStrike, Okta, Cylance, and Carbon Black.
Now we're transforming how SaaS is secured in the era of agentic AI.
Today, Obsidian is trusted by global enterprises like Snowflake, T-Mobile, and Pure Storage. We protect more than 200 organizations across North America, Europe, the Middle East, Southeast Asia, Australia, and New Zealand, including many of the world's largest Fortune 1000 and Global 2000 companies.
With strong global momentum, a growing partner ecosystem including SentinelOne, Databricks, and Google Cloud, and a major fundraise on the horizon, we're scaling quickly toward long-term growth and IPO readiness. Join us as we define the future of SaaS security!
Sr. Site Reliability Engineer (SRE) Obsidian
At Obsidian, our Sr. Site Reliability Engineers ensure the reliability, scalability, and operational excellence of a complex, multi-tenant SaaS platform serving enterprise and financial customers. As an SRE, you will work closely with DevOps, Platform Engineering, and product teams to improve system observability, incident response, and service resilience across the platform.
This is a hands-on engineering role focused on building operational excellence through monitoring, automation, debugging, and continuous improvement. You will help ensure that issues are detected and addressed quickly while contributing to systems that improve platform reliability at scale.
Key Responsibilities
Reliability Engineering: Improve the reliability, availability, and resiliency of Obsidian's production systems and distributed services
Detection & Observability: Build and maintain monitoring, alerting, dashboards, and observability tooling to enhance system visibility and reduce operational noise
Incident Response & Operations: Support incident response, on-call operations, troubleshooting, and postmortem processes to drive operational excellence
Collaboration: Partner with engineering teams to implement SLI/SLO practices, operational standards, and reliability-focused workflows
Execution: Automate infrastructure operations, deployment workflows, and platform tooling across Kubernetes, cloud infrastructure, and data pipelines
Required Qualifications
3-6 years of experience in Site Reliability Engineering, DevOps, Production Engineering, or related roles
Experience operating and supporting production systems in AWS and/or GCP
Familiarity with Kubernetes and Helm in cloud-native environments
Experience with observability and monitoring tools such as Prometheus, Grafana, Datadog, or similar platforms
Exposure to CI/CD systems such as GitLab CI/CD, GitHub Actions, ArgoCD, or equivalent
Strong troubleshooting and debugging skills across distributed systems and microservices
Experience writing automation or infrastructure tooling using scripting or programming languages
Strong systems thinking and a collaborative engineering mindset
Preferred Qualifications
AI Agent development experience
Experience supporting SaaS platforms in production environments
Familiarity with incident management and postmortem practices
Exposure to infrastructure-as-code and GitOps workflows
Understanding of SLI/SLO concepts and operational metrics
Experience with enterprise-scale monitoring or customer-facing production systems
Why This Role
Work on reliability challenges across a large-scale distributed SaaS platform
Build and improve observability and operational tooling used across engineering
Gain hands-on experience with cloud infrastructure, Kubernetes, and production systems
Help safeguard critical services for enterprise and financial customers
What Success Looks Like
Production issues are detected and resolved quickly
Monitoring and alerting provide clear, actionable operational insights
Reliability metrics and operational practices improve over time
Engineering teams can effectively troubleshoot and self-serve observability
Automation reduces operational toil and improves platform stability
Employee Benefits
Our competitive benefits packages are designed to support our employees' well-being both at work and at home. Our US-based employees enjoy:
Competitive compensation with equity and 401(k)
Comprehensive healthcare with dental and vision coverage
Flexible paid time off and paid holiday time off
12 weeks of new parent or family leave
Personal and professional development resources
For more details on our US benefits, or for information on our international benefits, please see here.
Pay Transparency
Please note that the base pay range is a guideline, and for candidates who receive an offer, the base pay will vary based on factors such as work location as well as the knowledge, skills, and experience of the candidate. In addition to a competitive base salary, this position is eligible for equity awards and may be eligible for sales commission or incentive compensation based on the role or function within the company.
At Obsidian, we are proud to be an equal-opportunity employer. We value diversity and hire for talent and passion. In compliance with federal law, all persons hired will be required to submit satisfactory proof of identity and legal authorization. If you have a need that requires accommodation, please contact
Information collected and processed as part of any job applications you choose to submit is subject to Obsidian's Applicant Privacy Policy.
Base Salary Range
95,000 - 117,000 GBP
Required Experience:
Senior IC