- Own end-to-end support for Domino Data Lab, GCP Dataproc, Galileo, and adjacent ML platforms.
- Perform installation, upgrades, configuration, patching, and environment maintenance.
- Monitor cluster health, resource utilization, job execution, performance, and alerts.
- Troubleshoot ML workloads involving Spark, Python, R, GPUs, containers, and orchestrators, working from JIRA tickets that are subject to SLAs.
- Manage access, security policies, service accounts, and platform governance.
- Ensure high availability, optimal performance, and adherence to operational SLAs.