Principal Software Engineer, Enterprise Scalability

Boston, MA - USA

Monthly Salary: Not Disclosed

Posted on: 2 days ago

Vacancies: 1 Vacancy

Job Summary

Be Klaviyo's senior IC for scale: you will report to a VP of Engineering and lead performance, reliability, multi-region, and large-tenant readiness. You'll drive platform-wide architectural change, hunt bottlenecks, optimize systems, and partner across teams to productionize improvements. This is an IC role with no direct reports; you will lead via technical depth, hands-on impact, and crisp cross-org alignment.

What You'll Do

Define enterprise scalability fitness functions (latency/throughput/error rates) and a scorecard; align teams to SLOs and budgets.
Design/implement sharding and partitioning strategies, caching/backpressure, multi-region readiness, and high-volume migration paths.
Build lightweight enablement: benchmarks, profiling harnesses, reproducible testbeds; pair with teams to land fixes.
Lead scalability reviews and readiness gates that accelerate, not block, delivery; drive incident deep dives tied to systemic fixes.
Communicate clearly to execs and engineers, tying technical work to business impact and customer outcomes.
Integrate AI into scale and resiliency work, from proactive anomaly detection to synthetic load and guided runbooks, so performance improvements stick and incidents don't repeat.

Who You Are

Experience: 12 years scaling multi-tenant SaaS, with a reputation for removing major bottlenecks and proving impact with data.
Technical expertise: Performance engineering, capacity planning, sharding/partitioning, caching/backpressure, multi-region readiness, and high-volume migrations; you turn hotspots into robust patterns.
AI tools & automation: You apply AI to scale work (profiling assistance, workload modeling, synthetic traffic generation, anomaly detection, and runbook copilots), always with explicit guardrails and observability.
Cross-org influence: You align teams through fitness functions, scorecards, and readiness gates that accelerate, not block, delivery; you communicate tradeoffs crisply to execs and engineers.
AI fluency: Curious, adaptable, and proactive in exploring AI that responsibly improves scale outcomes.

Nice to Haves

Scale scorecard: Company-wide fitness functions (latency/throughput/error rates) are adopted and reviewed regularly.
High-impact wins: 2-3 bottlenecks removed with documented, reproducible testbeds; pXX latencies and error rates improve on top enterprise workloads; repeat P0s trend down.
AI-assisted scale engineering: AI-driven anomaly detection reduces alert noise while improving signal; generative load testing and copilot runbooks are used in release/readiness checks for the top critical services; time-to-isolate regressions drops 20-30%.

Success in 6-12 Months

Company-wide scale scorecard in place; 2-3 high-impact bottlenecks removed; top enterprise workloads show improved pXX latencies and error rates; fewer repeat P0s.

We use Covey as part of our hiring and/or promotional process. For jobs or candidates in NYC, certain features may qualify it as an AEDT. As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on April 3, 2025.

Please see the independent bias audit report covering our use of Covey here

Required Experience:

Staff IC

Key Skills

Continuous Integration
Docker
Jenkins
Python
System Design
Agile
C/C++
Go
Systems Engineering
Software Development
Java
Distributed Systems

About Company

Klaviyo

Klaviyo unifies AI-powered email marketing and SMS to drive growth, retention, and measurable results. Build personalized, omnichannel experiences across WhatsApp, ecommerce, and more with K:AI Agents.
