About Singtel Digital InfraCo RE:AI
Singtel Digital InfraCo's RE:AI division is building Asia's most advanced and sustainable AI infrastructure ecosystem. RE:AI enables enterprises, research institutions, and digital-native businesses to accelerate innovation through responsible, high-performance AI compute and connectivity solutions.
Be a Part of Something BIG!
As an Operations Engineer supporting Singtel's GPU-as-a-Service (GPUaaS) platform, you will contribute to the implementation, integration, and day-to-day operations of data centre environments that enable customers' AI and High-Performance Computing (HPC) workloads. The role involves exposure to both physical data centre operations and the supporting software systems used in GPU-oriented facilities. It offers opportunities to build and deepen expertise in advanced data centre technologies for AI and HPC environments within a dynamic and continuously evolving operational setting.
Responsibilities:
Data Centre Operations Management
- Respond to, attend to, and escalate incidents based on defined criticality, impact, and service level agreements (SLAs).
- Perform hands-on operations involving air-cooled and liquid-cooled systems, as well as electrical systems, within the data centre environment.
- Participate actively in continuous improvement initiatives for operational processes, with consideration of GPU-oriented data centre requirements.
- Coordinate and obtain necessary security clearances for visitors and vendors accessing the GPUaaS data centre.
- Manage vendor activities and ensure compliance with Workplace Safety and Health (WSH) requirements and site regulations.
- Participate in scheduled or on-call support outside standard working hours, including nights, weekends, and public holidays, as required.
Data Centre Facilities Management
- Monitor data centre facilities and infrastructure across upstream and downstream systems (e.g. power, cooling, leakage detection, environmental controls).
- Maintain and update data centre documentation, including preparation of operational and incident reports as required.
- Coordinate with internal and external stakeholders to resolve technical and process-related issues within the GPUaaS data centre.
- Ensure adherence to established Standard Operating Procedures (SOPs), Methods of Procedure (MOPs), and Emergency Response Procedures (ERPs).
- Apply knowledge of power and cooling requirements for air-cooled and liquid-cooled servers to support operational enhancements and capacity planning.
- Coordinate maintenance activities and system shutdowns with stakeholders and vendors to ensure system reliability and availability.
- Prepare monthly Facilities Management reports on overall data centre health and performance.
- Identify potential workplace safety and health risks within the data centre environment.
- Conduct visual inspections of servers and cooling distribution units.
- Perform server troubleshooting in collaboration with remote engineering teams.
Requirements
- Diploma in Mechanical Engineering, Electrical Engineering, Building Services, or a related discipline.
- Broad understanding of data centre electrical and mechanical infrastructure, including fire safety systems, building management systems (BMS), equipment maintenance, and space planning.
- Experience in maintaining and operating data centre equipment, with emphasis on electrical and mechanical systems.
- Ability to work effectively both independently and as part of a team.
- Organised, adaptable, and able to respond to changing operational requirements and schedules.
- Demonstrated willingness to learn and develop skills in GPU-oriented and mission-critical data centre technologies.
Rewards that Go Beyond
- Flexible work arrangements
- Full suite of health and wellness benefits
- Ongoing training and development programs
- Internal mobility opportunities
Your Career Growth Starts Here. Apply Now!
Required Experience:
IC