Powering the Future with AIDA
To lead the next phase of our AI evolution, we've launched a new business unit, AIDA (Artificial Intelligence & Data Analytics), a strategic engine driving our transformation, designed to scale our AI ambitions with precision and purpose. This marks a pivotal shift in how we operate, innovate and serve, to embed intelligence into every layer of our business.
At Singtel, this is more than a technology upgrade. It's a strategic transformation that redefines how value is created across the enterprise core, augmenting human capabilities and unlocking entirely new potential. It is a transformation journey that aligns people, platforms and processes under one cohesive strategy. Our mission is to build AI literacy and foster a culture where intelligence empowers people.
We welcome you to join us on a transformational journey that's reshaping the telecommunications industry and redefining what's possible with AI at its core. Grow with us in a workplace that champions innovation, embraces agility and puts human potential at the heart of everything we do.
Be a Part of Something BIG!
We are seeking an experienced Lead AI Infrastructure Architect to design and oversee the enterprise-wide AI infrastructure that supports model training, fine-tuning, inference and agentic AI workloads. You will be responsible for shaping the overall architecture for on-premise and cloud-based AI services, including platform components, GPU infrastructure, container orchestration, networking, security and operational readiness.
This role requires deep expertise in AI system architecture, hands-on familiarity with open source AI infrastructure components, and strong capability to translate AI and business requirements into scalable and secure platform designs. You will work closely with solution architects, AI engineers, platform teams, cyber and system security, and cloud teams to ensure the organisation has a robust foundation for current and future AI capabilities.
Make an Impact by:
- Lead the design of end-to-end AI infrastructure architecture covering on-premise and cloud environments for model training, inference, RAG pipelines and agentic AI workloads.
- Define the technical blueprint for AI platform services, including container orchestration, data pathways, network topology, GPU clusters, storage and observability.
- Architect and guide the setup of Red Hat OpenShift environments (or equivalent) for AI workloads, including Ray, Kubeflow, MLflow or similar distributed ML frameworks, and vLLM or other inference and serving engines.
- Design and integrate cloud-based AI infrastructure (e.g. Azure, AWS or GCP), including compute, GPU architecture, networking, IAM and data access patterns.
- Oversee infrastructure capacity planning for GPU and CPU clusters, including utilisation monitoring, cost modelling and optimisation.
- Lead design reviews for AI infrastructure proposals from platform teams and vendors to ensure compliance with enterprise architecture principles.
- Work with networking teams to define connectivity requirements, zero trust boundaries, firewall rules, load balancing and traffic engineering for AI systems.
- Collaborate with cyber and information security to ensure platform hardening, identity management, data protection and secure use of open source components.
- Oversee observability and operational readiness for AI infrastructure, including logging, metrics, tracing, GPU health, versioning and rollback strategy.
- Support engineering teams by designing deployment patterns for Ray clusters, Kubeflow pipelines, fine-tuning services and high-availability model serving.
- Drive alignment across data platform, cloud engineering, DevSecOps and solution architects on the AI infrastructure roadmap and integration approach.
- Advise leadership on new technologies, open source adoption, performance benchmarks and emerging AI infrastructure patterns.
- Guide the migration of existing systems to modern AI platform architecture while ensuring performance and minimal operational disruption.
- Define disaster recovery and business continuity strategy for AI platform components.
- Ensure AI infrastructure designs support future scale, reliability, security and extensibility for enterprise use.
- Run proof-of-concept studies to validate new infrastructure solutions, evaluate GPU stack performance and compare open source frameworks.
Skills for Success:
- Bachelor's or Master's degree in Computer Science, Engineering, Information Systems or a related technical field.
- Candidates with a specialisation in infrastructure architecture, cloud platforms or distributed systems are preferred.
- More than 8 years of experience in on-premise and cloud infrastructure architecture, with strong exposure to distributed systems, container orchestration and AI platform design.
- At least 3 years in a senior architect role designing large-scale platforms or AI infrastructure.
- Strong knowledge of AI infrastructure, including model hosting, distributed inference, GPU clusters and LLM performance optimisation.
- Hands-on experience with Red Hat OpenShift (or equivalent) for container orchestration and AI workload deployment.
- Familiarity with Ray, Kubeflow, MLflow or similar distributed ML frameworks, and experience with vLLM or other inference and serving engines.
- Deep understanding of cloud computing architecture, including compute, storage, networking, load balancing, autoscaling and IAM.
- Strong understanding of network architecture, including routing, security zones, firewall rules and service mesh patterns.
- Experience designing secure API integration through MCP, API gateways or service mesh.
- Strong capability in infrastructure as code, DevSecOps practices, CI/CD and automation frameworks.
- Experience designing observability stacks for AI infrastructure, including logs, metrics, events and tracing.
- Familiarity with enterprise standards for data access, encryption, compliance and model safety controls.
- Strong analytical and systems-thinking capability, with the ability to simplify complex architecture for stakeholders.
- Collaborative mindset and ability to work with cross-functional teams across cloud, platform, cybersecurity and engineering.
- Strong communication and documentation skills when presenting architecture decisions and design principles.
- Ability to guide engineers, influence platform direction and provide clear technical leadership.
- Passion to stay updated on AI infrastructure innovations and new open source technologies.
Are you ready to say hello to BIG Possibilities?
Take the leap with Singtel to unlock new opportunities and accelerate your growth. Apply now and start your empowering career!