Principal Site Reliability Engineer
Job Summary
- Define and lead infrastructure and reliability strategy across the platform
- Design scalable, resilient systems in collaboration with engineering teams
- Optimize build, testing, and deployment processes for speed and stability
- Establish and uphold best practices for CI/CD, monitoring, and observability
- Lead incident response and drive continuous improvement post-incident
- Automate workflows to reduce operational toil and risk
- Mentor engineers and foster a culture of operational excellence
- Make strategic build-vs-buy decisions, balancing speed, quality, and sustainability
Qualifications :
- At least 8 years of experience in Site Reliability Engineering or DevOps roles, including 2 years in a Principal or Lead position
- Proven experience in infrastructure modernization and scaling initiatives for high-growth environments
- Strong proficiency in Python
- Deep expertise in cloud platforms and container orchestration tools such as AWS ECS and EKS
- Solid experience in CI/CD pipeline design and optimization using tools like GitHub Actions and Buildkite
- Proficiency in infrastructure-as-code tools such as Terraform
- Strong knowledge of monitoring, observability, and performance optimization practices
- Upper-Intermediate level of spoken and written English
WOULD BE A PLUS
- Experience with monorepos (Turborepo, pnpm)
- Familiarity with modern TypeScript tools (swc, biome, oxc)
- Knowledge of NestJS, NextJS, and testing frameworks (Jest, Vitest)
Additional Information :
PERSONAL PROFILE
- Excellent leadership, communication, and decision-making abilities
- Ability to work independently and make pragmatic build-vs-buy decisions in fast-paced environments
Remote Work :
Yes
Employment Type :
Full-time
Key Skills
- Kubernetes
- FMEA
- Continuous Improvement
- Elasticsearch
- Go
- Root cause Analysis
- Maximo
- CMMS
- Maintenance
- Mechanical Engineering
- Manufacturing
- Troubleshooting
About Company
At Sigma Software, we are involved with the client's team to contribute to the design and development of a technical solution for their tokenized domain reservation platform. We started by assigning a software architect to design the smart contracts and integrate blockchain into the s ...