Systems Development Engineer, Managed Edge Compute (Amazon Robotics)
Austin, TX - USA
Job Summary
As a SysDE II, you'll be a strong individual contributor who delivers high-quality technical solutions, contributes to architectural discussions, and builds reliable systems that enable robotics and automation teams to deploy and manage their edge compute solutions with the same ease as deploying to AWS. You'll work within established technical strategies while identifying opportunities for improvement, translating well-scoped business problems into concrete technical solutions, and balancing short-term delivery with long-term system health. This role requires solid technical depth across multiple domains (Linux systems, AWS services, IoT platforms, robotics compute infrastructure, and distributed systems), combined with the ability to partner effectively with engineers across the team and organization.
Key job responsibilities
- Build and maintain resilient, scalable distributed systems that operate at Amazon scale, contributing to the management of robotics device fleets across thousands of sites with 99.99% availability requirements.
- Contribute to the technical strategy for your team's systems within the UWC architecture, participating in decisions around hyperscale deployments, robotics compute patterns, fleet management, and edge device automation.
- Participate in architectural reviews and design discussions across UWC and robotics customer teams, contributing technical input on device lifecycle management, software distribution, multi-compute workcell support, and operational excellence patterns.
- Develop automation solutions using Python, Rust, CDK, and AWS services that eliminate entire classes of operational load and enable self-service for robotics solution teams.
- Implement and optimize Linux-based systems, OS image creation pipelines (Yocto/mkosi), and BSP solutions for diverse robotics hardware platforms, including x86, ARM, NVIDIA GPU systems, and embedded devices.
- Create tooling and frameworks that enable robotics teams to provision, configure, and manage their edge compute fleets, from AI perception systems to manipulation robotics, with minimal hands-on-keyboard time.
- Apply established standards for engineering, testing, and operational excellence best practices, and suggest improvements to processes within your team.
- Identify and implement opportunities to streamline or eliminate excess processes, improving agility and reducing complexity for robotics teams building on UWC.
- Proactively identify and escalate risks at the product and service level, contributing to the resilience, performance, and cost efficiency of UWC systems supporting critical robotics operations.
- Troubleshoot complex production issues across the full stack, from robotics device hardware and the Linux kernel to AWS cloud services, identifying patterns and implementing solutions that prevent future incidents.
- Partner with robotics solution teams (Amazon Robotics, manipulation systems, AI perception, workcell automation) to understand their device management challenges and contribute to solutions that meet their specific requirements.
- Foster the growth of peers on your team through code reviews, knowledge sharing, and collaborative problem-solving that raises the technical bar.
- Deliver solutions that are inventive, resilient, and extensible, making it easier for robotics teams to build on UWC.
- Participate in hiring and contribute to technical assessm
A day in the life
Your day might start by investigating an issue where robotics devices across multiple fulfillment centers are experiencing intermittent kernel panics during high-load operations. You dive deep into kernel logs, memory dumps, and device telemetry, correlating the failures with a recent driver update for NVIDIA GPU systems. You develop a Python- or Rust-based diagnostic tool to capture more granular system metrics and partner with senior engineers to roll back the problematic driver version while working on a fix that addresses the underlying memory management issue.
Mid-morning, you're troubleshooting why a new OS image isn't booting correctly on ARM-based manipulation robotics devices. You boot into a recovery environment, examine the initramfs, trace through systemd unit dependencies, and discover a race condition in the device initialization sequence. You modify the Yocto recipe to fix the boot ordering, test across multiple hardware variants, and document the pattern for other teams building custom images. You then join a sync with an Amazon Robotics team to help them debug why their software components are failing to deploy, walking through IoT certificate validation, network connectivity from the edge device, and AWS IAM permissions until you identify a misconfigured security group.
After lunch, you're participating in a code review for a new credential rotation service, providing written feedback on error handling patterns, memory safety, and how to better structure the state machine for resilience. You spend time optimizing a Linux system configuration that's causing performance bottlenecks on AI perception systems, tuning Linux system parameters to enable high-performance compute workloads. You pair with a teammate who's working through a complex Yocto build failure, sharing what you know about layer dependencies and BitBake recipe inheritance while collaborating on debugging techniques.
The afternoon includes responding to a page where devices in a specific building can't connect to AWS IoT Core. You systematically eliminate possibilities, checking DNS resolution, testing TLS handshakes, examining certificate chains, and analyzing network packet captures, until you discover a misconfigured firewall rule blocking MQTT traffic. You implement a monitoring enhancement to detect this class of issue proactively across all sites. You then contribute to a technical design document proposing improvements to UWC's device provisioning workflow that will reduce provisioning time from 20 minutes to under 10 minutes by parallelizing certificate generation and optimizing the Linux boot sequence. You'll end your day reviewing system metrics across the fleet, flagging devices with degraded disk I/O that need proactive maintenance, and syncing with your team on priorities for tomorrow.
Amazon offers a full range of benefits that support you and eligible family members, including domestic partners. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status, such as seasonal or temporary employment. The benefits that generally apply to regular, full-time employees include:
1. Medical, Dental, and Vision Coverage
2. Maternity and Parental Leave Options
3. Paid Time Off (PTO)
4. 401(k) Plan
If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you're passionate about this role and want to make an impact on a global scale, please apply!
About the team
The Unified Workcell Compute (UWC) team is at the forefront of Amazon's robotics and automation efforts, building and operating the foundational device management platform for Amazon's on-premise edge compute fleet. Our services manage over a million robotic devices across thousands of locations worldwide, from the latest NVIDIA GPU offerings enabling AI perception efforts to bleeding-edge manipulation robotics systems, industrial PCs, thin clients, Drive Units, and embedded devices across Amazon's global fulfillment network.
Our mission is to enable robotics solution teams to deploy to Operations buildings with the same self-service, ownership, and accountability as deploying to AWS cloud. We're revolutionizing Amazon's logistics and fulfillment operations by pushing the boundaries of what's possible in automation and compute management at unprecedented scale.
We're a team of builders who value automation, operational excellence, and customer obsession. We own a critical technology ecosystem that powers device provisioning, software distribution, credential management, and fleet operations for robotics workcells and fulfillment systems. Our work directly impacts millions of customer orders and enables Amazon's promise of fast, reliable delivery. We're solving problems that few organizations face, building systems that have never existed before, and defining the future of edge compute management for robotics at Amazon scale.
We foster a culture that encourages personal and professional growth, empowering our team members to continually expand their skills and knowledge. Work-life balance is a priority for us, and we strive to create an environment where our team can thrive both professionally and personally.
- Experience in automating, deploying, and supporting large-scale infrastructure
- Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C, C#, or Rust
- Experience with Linux/Unix
- Experience with CI/CD pipelines and build processes
- Experience with distributed systems at scale
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits, including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at
Austin, TX: 129,200.00 - 174,800.00 USD annually
Required Experience:
IC