Gradial helps marketers and creatives move from idea to execution faster. Our platform turns intent into action, automating website updates, design system migrations, and ongoing content optimization while preserving brand integrity across every touchpoint.
Backed by leading investors, we're building software that adapts to the user, not the other way around. We move with urgency, operate with ownership, and solve hard problems from first principles. If you want to do ambitious work, take real responsibility, and help define the future of AI-native content operations, you'll do your best work here.
The Role
As a Principal Site Reliability Engineer at Gradial, you will shape the foundation our platform runs on as we scale. You will work closely with the CTO and engineering team to make our systems faster, more resilient, and easier to operate in a high-growth environment. This is a hands-on IC leadership role for someone who wants real ownership, high leverage, and the chance to define what reliability looks like at an AI-native company.
What You'll Own
- Own the reliability, scalability, and operational health of Gradial's production platform.
- Lead the evolution of Kubernetes, CI/CD, observability, and infrastructure as code across the stack.
- Set the standard for how we design, ship, and operate reliable systems.
- Build the tooling and automation that help engineers move faster with more confidence.
- Drive improvements in monitoring, alerting, incident response, and service readiness.
- Partner with engineering to spot scaling risks early and solve them before they slow us down.
- Influence the long-term direction of our platform across reliability, security, performance, and cost.
What We're Looking For
- 5 years of experience in SRE, DevOps, platform engineering, or infrastructure roles, with direct ownership of production systems.
- Proven success designing and operating production-grade infrastructure in fast-moving, high-growth environments.
- Deep expertise in Kubernetes, cloud-native architecture, and container orchestration.
- Strong experience with infrastructure as code, GitOps, CI/CD workflows, and modern deployment practices.
- Strong command of observability and reliability fundamentals across metrics, logging, tracing, alerting, and incident response.
- A track record of leading through influence, making sound technical decisions, and raising the bar across engineering teams.
Nice to Have
- Familiarity with AI or ML infrastructure, including GPU provisioning, model deployment, or compute-intensive workloads.
- Experience supporting cloud or multi-cloud environments with a focus on resilience and scale.
- Comfort with TypeScript or Python for internal tooling and operational automation.
The salary range for this position is $180,000 to $240,000 annually. Final compensation will be determined based on factors such as experience and skills. In addition to base salary, this role may be eligible for performance-based bonuses and equity awards. Gradial offers a comprehensive benefits package, including medical, dental, and vision insurance, a 401(k) retirement plan, paid time off, paid sick leave, and other employee wellness programs.
You'll thrive here if you...
- Embrace AI as a core tool for problem-solving, creativity, and scale.
- Show a strong work ethic, high ownership, and a bias toward action.
- Communicate with clarity and curiosity.
- Thrive in fast-paced, hyper-growth environments where building is always better than maintaining the status quo.
What we offer
- Meaningful equity and competitive salary
- Comprehensive health, dental, and vision coverage
- Fast-paced environment with autonomy and ownership
- Real impact, zero bureaucracy
- A front-row seat to building category-defining AI infrastructure
AI Literacy & Interviewing Tools
As an AI-first company, we prioritize AI literacy as a core competency in our hiring decisions. We're excited by candidates who thoughtfully apply AI tools in their work, but during interviews, we're focused on you. This is your opportunity to show how you think, communicate, and solve problems. Over-reliance on AI-generated responses during the interview process (especially when it obscures your own voice) will result in disqualification. We want to understand your unique perspective and how you approach challenges, both with and without AI.
Privacy Policy
By submitting your application to Gradial, you acknowledge that any personal data you provide will be processed in accordance with our Privacy Policy. This includes the collection, use, and storage of your information for the purposes of evaluating your qualifications and communicating with you about your candidacy. We handle applicant data with care and in compliance with applicable data protection laws.
If you have any questions about how your information is used, please refer to our Privacy Policy or contact us directly.
Applicants who require reasonable accommodation to participate in the application or interview process should contact us to request assistance.
Required Experience:
Staff IC