Reliability Engineer - Denver, Colorado - Full Time / Direct Hire
Our team is a talented group of open source developers creating a successful and sustainable business that will impact the entire globe. We are creating open source, enterprise-grade products that help individuals and organizations unlock their potential to become top performers in their respective fields. We are building tools that support the full span of the web development life cycle.
Description
You will work in a fast-paced, dynamic development environment, designing, developing, and delivering our dev and hosting products. You will help maintain 24x7 uptime on public cloud infrastructure, act as first responder during outages for clients with managed hosting and self hosting, and contribute to the design and maintenance of logging, networking, monitoring, security, and disaster recovery.
Requirements
Experience managing production Kubernetes clusters
Fluent in one programming language such as Python, Go, or Ruby
Experience spanning DevOps, SRE, or Systems Operations
Experience managing Linux-based servers (CoreOS a plus)
Understanding of containers
Troubleshooting systems, networks, and code
Solid knowledge of system performance and monitoring
Bonus Experience
Experience with federated Kubernetes clusters
Experience with large cloud hosting providers such as AWS, GCP, and Azure