100% REMOTE
EST HOURS
COMMUNICATION SKILLS MUST BE 10/10
The Cloud Data Platform Engineer will play a central role in the deployment, monitoring, and optimization of cloud platforms used by our data scientists. This role requires hands-on expertise in cloud infrastructure, modern DevOps practices, secure data operations, and automation frameworks. You'll partner closely with data scientists, machine learning engineers, and data engineers to ensure our analytics systems run securely, efficiently, and at scale. You'll work with our Platform Engineering team to define and implement these patterns so they can be codified and reused across our organization.
Location: Remote, preferably US-based
Primary Responsibilities
Cloud Infrastructure & Operations
- Manage, scale, and optimize cloud environments used for data science workloads (primarily AWS, Databricks, and dbt).
- Provision, maintain, and optimize compute clusters for ML workloads (e.g., Kubernetes, ECS/EKS, Databricks, SageMaker).
- Implement and maintain high-availability solutions for mission-critical analytics platforms.
DevOps & Automation
- Develop CI/CD pipelines for model deployment, infrastructure-as-code (IaC), and automated testing using industry-standard toolchains.
- Build monitoring, alerting, and logging systems for cloud and ML infrastructure (e.g., Datadog, CloudWatch, Prometheus, Grafana, ELK).
- Automate provisioning, configuration, and deployments using tools such as Terraform, CloudFormation, and GitHub Actions.
Data Platform Support
- Enable and improve data ingestion, transformation, and model execution workflows through platform capabilities and automation.
- Develop and maintain self-service capabilities for data scientists to provision and manage reliable, reproducible environments for research and development.
- Collaborate with Data Engineering to maintain integrations between data pipelines and cloud systems.
- Share responsibility for provisioning and operating application networking capabilities that support data platforms, including API gateways, CDNs, application load balancers, TLS, and WAFs.
Security, Compliance & Governance
- Implement and operationalize security and compliance controls for data science platforms in alignment with enterprise cloud standards.
- Conduct periodic risk assessments, best-practice reviews, and remediation efforts to strengthen security and resiliency.
- Support secure handling of sensitive financial data.
Cross-Functional Collaboration
- Partner with data scientists, machine learning engineers, and data engineers to deeply understand and support their needs and workflows within data-driven initiatives.
- Serve as a technical advisor on cloud architecture, performance optimization, and production readiness for data and ML platforms.
- Adopt and champion Agile, DevOps, and Platform Engineering practices (kanban, scrum, continuous improvement, automation, Everything-as-a-Service).
- Demonstrate a strong, proactive focus on serving internal customers, prioritizing user experience, and identifying opportunities to leverage automation and self-service to reduce toil and cognitive load for developers and researchers.
Requirements
Education & Certificates
- A bachelor's degree or higher in a STEM field is required.
Professional Experience
- 5 years of experience in cloud operations, DevOps, platform engineering, SRE, sysadmin, or related roles.
- Strong proficiency with at least one major cloud provider (AWS preferred).
- Hands-on experience with IaC tools (Terraform, CloudFormation, or similar).
- Strong scripting skills (Python, Bash, or PowerShell).
- Strong understanding of modern authentication and authorization technologies and secrets management (IAM, OIDC, OAuth2, RBAC, ABAC, privileged access management, JIT authorization, PKI).
- Experience with common CI/CD systems (GitHub Actions, Jenkins, GitLab CI, ArgoCD, or similar).
- Familiarity with container orchestration (Docker Compose, EKS/Kubernetes, ECS).
- Experience supporting data-intensive or ML workloads.
Preferred
- Experience in financial services, investment management, or other highly regulated industries.
- Knowledge of ML/AI platform tools (Databricks, SageMaker, MLflow, Airflow).
- Hands-on experience with AI Engineering and LLMOps tools (LLM observability, eval pipelines, building/supporting agentic workflows) is a huge plus.
- Understanding of networking, VPC architectures, and cloud security best practices.
- Familiarity with distributed compute frameworks (Spark, Ray, Dask).