Glowingbud is a rapidly growing eSIM services platform that simplifies connectivity with powerful APIs, robust B2B and B2C interfaces, and seamless integrations with Telna. Our platform enables global eSIM lifecycle management, user onboarding, secure payment systems, and scalable deployments. Recently acquired by Telna, we are expanding our product offerings and team to meet increasing demand and innovation goals.
Job Summary
We are seeking a highly experienced Senior DevOps Engineer with 10 years of expertise in cloud infrastructure, automation, and system reliability. The ideal candidate will be responsible for maintaining scalable AWS-based environments, implementing robust CI/CD pipelines, optimizing system performance, and ensuring high availability of critical applications. This role requires deep expertise in Docker, Kubernetes, Infrastructure as Code (IaC), and system monitoring. The candidate will also be responsible for documenting system architecture, setting SLAs, and leading DevOps best practices across teams. If you thrive in a fast-paced, collaborative environment and are passionate about DevOps, we'd love to hear from you!
Key Responsibilities:
Infrastructure Management: Design, implement, and maintain scalable cloud infrastructure using AWS services.
System Documentation & Diagrams: Maintain up-to-date system diagrams, architecture documentation, and operational procedures.
Containerization & Orchestration: Deploy and manage containerized applications using Docker and Kubernetes.
System Maintenance & Optimization: Ensure high availability, performance tuning, and cost optimization of cloud and on-premises infrastructure.
Monitoring & Observability: Implement detailed system monitoring, logging, and alerting using tools like Datadog, Prometheus, Grafana, the ELK stack, or AWS CloudWatch.
Security & Compliance: Enforce security best practices, conduct regular audits, and ensure adherence to compliance standards.
CI/CD Pipeline Management: Build and maintain automated deployment pipelines for seamless application releases.
Incident Response & SLA Management: Define SLAs, monitor system performance, and establish an efficient incident response strategy.
Collaboration & Leadership: Work closely with development, QA, and operations teams to improve reliability, scalability, and efficiency.
Qualifications:
10 years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure roles.
Expert knowledge of AWS services (EC2, ECS, S3, RDS, Mongo Atlas, Lambda, VPC, ALB, Gateway, Cognito, WAF, IAM, Amplify, CloudFormation, etc.).
Strong experience with Docker & Kubernetes for container orchestration and management.
Hands-on experience with Infrastructure as Code (IaC) tools like Terraform, CloudFormation, or Pulumi.
Expertise in system monitoring and logging tools (Prometheus, Grafana, ELK Stack, Datadog, AWS CloudWatch).
Proficiency in scripting languages (Bash, Python, or Go) for automation and infrastructure management.
Experience with CI/CD pipelines using Jenkins, AWS CodePipeline, or GitHub Actions.
Knowledge of networking, security best practices, and system performance tuning.
Experience with setting and enforcing SLAs for DevOps teams.
Strong problem-solving skills and ability to work in a fast-paced environment.
Preferred Skills:
Thorough experience with AWS infrastructure.
Knowledge of serverless architectures and event-driven computing.
Experience with configuration management tools (Ansible, Chef, Puppet).
Background in database administration (PostgreSQL, MySQL, or NoSQL databases).
Full Time