The Alexa Sensitive Content Intelligence team leverages natural language understanding, machine learning, and big data to help Alexa identify sensitive content. We're seeking an exceptional Data Engineer to join our Data Engineering team, where you'll architect and build the data systems that power Amazon Devices' content moderation data warehouse and reporting pipeline. This role combines technical expertise with business acumen to transform complex data into actionable insights.
You'll be instrumental in building the next generation of data-driven tools that detect and prevent inappropriate content from being exposed to customers on Amazon Devices, including Alexa. You should have experience with real-time data processing, high-throughput systems, and end-to-end platform development. Knowledge of modern data engineering tools and technologies is essential.
The ideal candidate combines technical excellence with strategic thinking, bringing both the ability to architect complex systems and the vision to drive innovation in technology.
Core Responsibilities:
Contribute to the architecture, design, and implementation of next-generation BI solutions, including streaming data applications.
Manage AWS resources including EC2, RDS, Redshift, Kinesis, EMR, Lambda, etc.
Collaborate with data scientists and BIEs to deliver high-quality data architecture and pipelines.
Interface with other technology teams to extract, transform, and load data from a wide variety of data sources.
Continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers.
Basic Qualifications:
Bachelor's degree in computer science, engineering, mathematics, or a related technical discipline
Industry experience in software development, data engineering, business intelligence, data science, or a related field, with a track record of manipulating, processing, and extracting value from large datasets
Experience using big data technologies (Hadoop, Hive, HBase, Spark, EMR, etc.)
Experience working with AWS big data technologies (EMR, Redshift, S3, AWS Glue, Kinesis, and Lambda for serverless ETL)
Knowledge of data management fundamentals and data storage principles
Knowledge of distributed systems as they pertain to data storage and computing
Hands-on experience and advanced knowledge of SQL
Basic scripting skills in Python and Scala
Basic understanding of machine learning
Key job responsibilities
Design and implement scalable data infrastructure.
Develop robust data pipelines and analytics processes that enable real-time decision making
Collaborate with scientists, software engineers, and product managers to deliver reliable data solutions
Lead technical initiatives and mentor team members in best practices for data engineering
Create automated systems to replace manual processes and support global expansion
A day in the life
As a member of the team, you will be responsible for the design, development, and launch of new features. You will participate with other senior engineers and peers in developing the overall system architecture, contribute to the review of peer engineers' designs and code, and mentor junior engineers. You will also work on high availability, security, compliance, and maintenance of our existing services.
- 3 years of data engineering experience
- Experience with data modeling, warehousing, and building ETL pipelines
- Experience with SQL
- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit
for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.