Amazon's AGI Information is seeking an exceptional Software Engineer to drive ML systems development in the Amazon Knowledge Graph (AKG) team. AKG is re-inventing knowledge graphs for the LLM era, developing sophisticated ML models and pipelines that enable efficient LLM grounding and power LLM-based customer experiences.
We're looking for candidates who combine strong software engineering fundamentals with practical ML system development experience. You'll need to demonstrate expertise in building scalable, fault-tolerant distributed systems, with a track record of shipping production services that handle large-scale workloads. While ML engineering skills are important, we prioritize candidates who understand professional software engineering practices across the full development lifecycle, from system design and coding standards to testing, deployment, and operational excellence.
Key job responsibilities
- Design, develop, and maintain ML model serving infrastructure to enable high-throughput, low-latency entity resolution predictions in production environments
- Collaborate with applied scientists to productionize ML models, including implementing model improvements and new architectures for entity matching and deduplication
- Develop efficient data processing pipelines to handle large-scale training and inference data for entity resolution models
- Support experimentation and A/B testing infrastructure to evaluate model improvements
- Work closely with downstream engineering teams to integrate entity resolution capabilities into various product surfaces
- Participate in code reviews, technical design discussions, and sprint planning to ensure high-quality software delivery
- Strong understanding of ML fundamentals and common optimization techniques
- Experience with data processing and ETL pipelines at scale
- 3 years of non-internship professional software development experience
- Bachelor's degree in computer science or equivalent
- 2 years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Proven experience designing and implementing scalable, fault-tolerant distributed systems with a focus on performance, reliability, and operational excellence
- Master's degree in computer science or equivalent
- Strong background in event-driven architectures, distributed caching, REST APIs, vector search, and data processing patterns (CDC, streaming)
- Deep experience with core AWS services, including DynamoDB, ElastiCache, Lambda, S3, and OpenSearch, and with infrastructure-as-code (CDK/CloudFormation)
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.