The Strategic Customers Engineering (SCE) team at OCI manages the relationships with some of our most significant AI Infra customers, who are the key drivers of our response to rapid market growth across the AI Infra business in APAC. We are establishing the APAC Strategic Pursuits team to guide Oracle's largest and high-growth cloud opportunities to success. As a Principal AI/ML Solution Architect on this team, you will help the most strategic customers in the region solve technical challenges with OCI.
This position is based in Hong Kong and requires 40%-50% travel across the region.
Responsibilities:
Collaborate with the GPU sales team and the SCE AIML TPM team to provide technical support for customers at both the pre-sales and after-sales stages. Take ownership of problems and work to identify solutions.
Design, deploy, and manage infrastructure components such as cloud resources, distributed computing systems, and data storage solutions to support AI/ML workflows.
Collaborate with customers, scientists, and software/infrastructure engineers to understand infrastructure requirements for training, testing, and deploying machine learning models.
Implement automation solutions for provisioning, configuring, and monitoring AI/ML infrastructure to streamline operations and enhance productivity.
Optimize infrastructure performance by tuning parameters, optimizing resource utilization, and implementing caching and data pre-processing techniques.
Troubleshoot infrastructure performance, scalability, and reliability issues, and implement solutions to mitigate risks and minimize downtime.
Stay updated on emerging technologies and best practices in AI/ML infrastructure and evaluate their potential impact on our systems and workflows.
Document infrastructure designs, configurations, and procedures to facilitate knowledge sharing and ensure maintainability.
Qualifications:
Experience in scripting and automation using tools like Ansible, Terraform, and/or Kubernetes. Experience with containerization technologies (e.g., Docker, Kubernetes) and orchestration tools for managing distributed systems.
Solid understanding of networking concepts, security principles, and best practices.
Excellent problem-solving skills with the ability to troubleshoot complex issues and drive resolution in a fast-paced environment.
Strong communication and collaboration skills with the ability to work effectively in cross-functional teams and convey technical concepts to non-technical stakeholders.
Strong documentation skills, with experience documenting infrastructure designs, configurations, procedures, and troubleshooting steps to facilitate knowledge sharing, ensure maintainability, and enhance team collaboration.
Strong Linux skills, with hands-on experience in Oracle Linux/RHEL/CentOS, Ubuntu, and Debian distributions, including system administration, package management, shell scripting, and performance optimization.
Career Level - IC4
Required Experience:
Staff IC
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...