Amazon Lab126 is an inventive research and development company that designs and engineers high-profile consumer electronics. Lab126 began in 2004 as a subsidiary of Amazon.com, Inc., originally creating the best-selling Kindle family of products. Since then, Lab126 has produced devices like Fire tablets, Fire TV, Amazon Echo, and Dash Button. The Device OS team is a big part of creating these innovative devices at Lab126, providing the core OS platform features and components.
In the Device OS group, we are inventing the future of consumer electronics and are looking for a System Development Engineer III to help us bring that vision into reality and solve real-world challenges that will transform our customers' experiences in ways we can't even imagine yet. The team develops scalable cloud solutions that enable our partners to build and launch services and devices quickly and in a cost-effective way. If you love being hands-on, designing and implementing a quality platform for our consumer electronic devices while working with a world-class, highly accomplished team, we would love to talk with you.
This role is specifically for engineers who have extensive experience in DevOps/SRE roles and who also meet the SysDev guidelines.
As a System Development Engineer III, you will contribute technically to a complex charter of building and delivering T1 cloud services for Device OS. You will implement initiatives defined as part of the roadmap for a multi-year, business-critical cloud technology solution that will be rolled out to multiple device types across millions of devices. You will deliver technical initiatives that drive cost optimization across various AWS environments, manage the availability, latency, and performance of our mission-critical services, and build automation to prevent problem recurrence. You will periodically participate in reviewing capacity planning, sizing, and optimization of the cloud platform. You will work closely with platform and application teams to ensure the highest level of quality for the Device OS deliverables.
You will act as a technical leader, driving architecture decisions, improving system reliability, mentoring engineers, and partnering with product and development teams to deliver resilient solutions at scale.
Key job responsibilities
Design, implement, and operate highly available, fault-tolerant systems on AWS
Architect solutions using AWS services such as EC2, EKS, ECS, ALB/NLB, API Gateway, Lambda, RDS, DynamoDB, S3, and CloudFront
Lead design reviews and influence long-term platform and reliability strategy
Define and manage SLIs, SLOs, SLAs, and error budgets
Drive improvements in availability, latency, performance, and scalability
Lead incident response, root cause analysis (RCA), and post-incident reviews
Reduce toil through automation and operational best practices
Automation & Infrastructure as Code
Build and maintain Infrastructure as Code (IaC) using Terraform, CloudFormation, or CDK
Automate deployments, scaling, and recovery using CI/CD pipelines
Develop internal tools and scripts using Python, Java, Go, or Bash
Implement robust monitoring, logging, and alerting using CloudWatch
Optimize capacity planning and cost using AWS cost optimization techniques
Ensure systems meet security, compliance, and operational standards
Leadership & Collaboration
Act as a technical mentor for junior and mid-level engineers
Collaborate with product, application, and security teams to deliver end-to-end solutions
Influence engineering best practices and operational standards across teams
A day in the life
Design and operate scalable, reliable AWS-based systems
Develop automation and tooling using Python/Java/Go
Manage infrastructure using Terraform/CloudFormation/CDK
Monitor production systems and proactively resolve issues
Participate in on-call, lead incident response, and drive RCAs
Define and improve SLIs, SLOs, and error budgets
Optimize system performance, availability, and AWS costs
Collaborate with application teams and mentor engineers
- Bachelor's degree
- 8 years of experience in system engineering, SRE, DevOps, or platform engineering
- Strong hands-on experience with AWS cloud services
- Solid programming experience in Python, Java, Go, or similar languages
- Experience with Linux systems, networking, and distributed systems
- Proven experience handling production systems at scale
- Experience operating containers in production systems
- Strong understanding of microservices architecture
- Experience with multi-region, multi-account AWS setups
- AWS certifications (Solutions Architect / DevOps Engineer / SysOps)
- Experience with security best practices, IAM, and compliance frameworks
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit
for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.