Xebia is a trusted advisor in the modern era of digital transformation, serving hundreds of leading brands worldwide with end-to-end IT solutions. The company has experts specializing in technology consulting, software engineering, AI, digital products and platforms, data, cloud, intelligent automation, agile transformation, and industry digitization. In addition to providing high-quality digital consulting and state-of-the-art software development, Xebia has a host of standardized solutions that substantially reduce the time-to-market for businesses.
Xebia also offers a diverse portfolio of training courses to help support forward-thinking organizations as they look to upskill and educate their workforce to capitalize on the latest digital capabilities. The company has a strong presence across 16 countries, with development centres across the US, Latin America, Western Europe, Poland, the Nordics, the Middle East, and Asia Pacific.
1. Reliability: Enhance service reliability by reducing incidents, improving stability, and ensuring high availability.
2. Observability: Strengthen system visibility to reduce MTTR, enhance proactive detection, and drive data-driven operations.
3. Design, build, and maintain scalable and reliable AWS infrastructure. Strong hands-on experience with AWS services (EC2, ECS, Serverless, S3, VPC, DynamoDB).
4. Implement and manage CI/CD pipelines using tools like AWS CodePipeline (preferred) or GitHub Actions.
5. Build and maintain observability stacks using tools like Splunk, Prometheus, SignalFx, and New Relic.
6. Automate operational tasks using Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation.
7. Manage containerized workloads using Docker and Kubernetes (EKS preferred).
8. Define and track SLIs, SLOs, and SLAs to ensure service reliability.
9. Design High Availability (HA) and Disaster Recovery (DR) patterns incorporating auto-scaling, active-passive failover, and retry-safe microservices across regions.
10. Demonstrated contributions/metrics showing progression on organization-wide SRE initiatives (e.g., Observability as Code, Resilience Engineering, High Availability, Disaster Recovery, Toil Reduction, CUJ & SLO) and the internal SRE engineering roadmap, where applicable.
11. Proficiency in a scripting language such as Python, Bash, or Go.
Good to Have
Linux, networking, security, and system administration.
Event-Driven Architecture (Kafka/SQS).
Backend technologies like Java and Node.js.
Certifications expected:
Terraform Associate Certificate
AWS Certified Solutions Architect
AWS Certified Developer
Some useful links:
Xebia Creating Digital Leaders.
Experience:
Senior IC
Full Time