Dev / ML Ops
Base Salary: $120k-$150k
Remote within Canada
Job Overview:
We are seeking a highly skilled DevOps Engineer to manage and optimize our AWS cloud infrastructure while supporting ML Ops initiatives. This role will focus on ensuring our cloud systems are secure, scalable, and efficient, while also enabling seamless deployment and operation of machine learning workflows.
Key Responsibilities:
- Cloud Infrastructure Management: Design, implement, and maintain robust, scalable, and cost-efficient cloud solutions on AWS.
- Automation & CI/CD: Build and maintain CI/CD pipelines to automate infrastructure provisioning, application deployments, and system monitoring.
- Monitoring & Optimization: Develop monitoring solutions to ensure performance, reliability, and cost-effectiveness of cloud infrastructure.
- Security: Implement cloud security best practices, including IAM, network configurations, and encryption strategies.
- ML Ops Support: Collaborate with the AI team and engineers to operationalize machine learning models, ensuring smooth integration into production systems.
- Containerization & Orchestration: Use tools like Docker to containerize applications and manage clusters effectively.
- Collaboration: Partner with software developers, data engineers, and other stakeholders to streamline workflows and ensure infrastructure aligns with business needs.
- Documentation: Maintain comprehensive documentation of infrastructure, processes, and best practices for internal use and onboarding.
Qualifications:
- Experience: 3 years of experience in DevOps or a related role, with exposure to ML Ops workflows.
- Technical Skills:
  - Expertise in AWS services (e.g., EC2, S3, Lambda, EKS, SageMaker).
  - Proficiency in Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
  - Hands-on experience with CI/CD tools like GitHub Actions or GitLab CI/CD.
  - Strong skills in containerization (Docker) and orchestration (Kubernetes).
  - Proficiency in scripting languages such as Python, Bash, or PowerShell.
- ML Ops Knowledge: Familiarity with SageMaker, Kubeflow, MLflow, or equivalent tools for machine learning operations.
- Monitoring Tools: Experience with observability tools like CloudWatch, Prometheus, Grafana, or similar.
- Problem-Solving: Strong troubleshooting skills for cloud and system-related issues.
- Communication: Clear and effective communication skills to collaborate across technical and non-technical teams.
Nice-to-Have:
- Experience managing data engineering workflows or working with platforms like Databricks.
- Knowledge of serverless architecture and event-driven systems.
- Familiarity with cloud cost management tools and strategies.
- Exposure to advanced security compliance frameworks and practices.
- Familiarity with Ruby on Rails.
Apply Now:
Join a team that values diversity and inclusivity. Thrive Career Wellness is proud to be an Equal Opportunity Employer. Should you require accommodation during the hiring process, please let us know. Applicants must be legally entitled to work in Canada.
Step into a role that empowers you to be at the forefront of career wellness innovation. Apply today and join us in making careers thrive!