- Define and lead infrastructure and reliability strategy across the platform
- Design scalable, resilient systems in collaboration with engineering teams
- Optimize build, testing, and deployment processes for speed and stability
- Establish and uphold best practices for CI/CD, monitoring, and observability
- Lead incident response and drive continuous post-incident improvement
- Automate workflows to reduce operational toil and risk
- Mentor engineers and foster a culture of operational excellence
- Make strategic build-vs-buy decisions, balancing speed, quality, and sustainability
Qualifications:
- At least 8 years of experience in Site Reliability Engineering or DevOps roles, including 2 years in a Principal or Lead position
- Proven experience in infrastructure modernization and scaling initiatives for high-growth environments
- Strong proficiency in Python
- Deep expertise in cloud platforms and container orchestration tools such as AWS ECS and EKS
- Solid experience in CI/CD pipeline design and optimization using tools like GitHub Actions and Buildkite
- Proficiency in infrastructure-as-code tools such as Terraform
- Strong knowledge of monitoring, observability, and performance optimization practices
- Upper-Intermediate level of spoken and written English
WOULD BE A PLUS
- Experience with monorepos (Turborepo, pnpm)
- Familiarity with modern TypeScript tools (swc, biome, oxc)
- Knowledge of NestJS, NextJS, and testing frameworks (Jest, Vitest)
Additional Information:
PERSONAL PROFILE
- Excellent leadership, communication, and decision-making abilities
- Ability to work independently and make pragmatic build-vs-buy decisions in fast-paced environments
Remote Work:
Yes
Employment Type:
Full-time