Job Title: LLMOps Engineer
Location: Hybrid - Toronto, Canada
Salary Range: $140K - $160K
Reports To: Head of AI, with collaboration across Engineering & DevOps
Job Summary:
We are seeking an experienced and highly skilled LLMOps Engineer to join our team at Thrive. This newly created role will be responsible for deploying, optimizing, and scaling large language model (LLM) applications across our platform. The successful candidate will own the operational backbone of our AI-driven products, ensuring performance, reliability, and cost-efficiency while collaborating closely with our AI and engineering teams.
If you thrive in fast-paced environments, enjoy building scalable AI infrastructure, and are excited about shaping the future of LLM capabilities at Thrive, this is the role for you.
Key Responsibilities:
- Lead LLM infrastructure efforts across multiple engineering teams, ensuring scalable, secure, and efficient delivery of AI-powered features.
- Design, build, and maintain production-grade systems for deploying and managing LLMs, including versioning, A/B testing, and rollback strategies.
- Collaborate with the AI team to implement prompt management systems, prompt versioning, and token optimization strategies.
- Monitor and optimize inference latency, throughput, caching strategies, and multi-provider cost management (OpenAI, Anthropic, AWS Bedrock, etc.).
- Develop observability pipelines, including quality metrics, evaluation workflows, error monitoring, and user feedback loops.
- Implement and maintain Retrieval-Augmented Generation (RAG) systems, embedding pipelines, and vector database operations.
- Support fine-tuning workflows and manage model registries for both proprietary and open-source models.
- Implement AI safety guardrails, content filtering, and compliance measures to ensure responsible deployment.
- Support general DevOps initiatives 10% of the time, including CI/CD improvements and cloud infrastructure updates.
- Maintain thorough documentation of all LLM infrastructure processes and best practices.
Business Problem the LLMOps Engineer Will Solve:
This role will serve as the foundation of Thrive's AI infrastructure, ensuring our LLM-powered features are reliable, cost-effective, and scalable. By establishing strong operational systems and evaluation pipelines, the LLMOps Engineer will directly accelerate Thrive's ability to deliver meaningful AI-driven career solutions for our customers.
Ideal Candidate Demographics:
- 3 years of experience in LLMOps, MLOps, or similar production-focused AI/ML roles.
- Strong Python programming skills and familiarity with LLM libraries and frameworks.
- Hands-on experience with LLM providers (OpenAI, Anthropic, AWS Bedrock, Azure, Vertex, Databricks).
- Experience with vector databases such as Pinecone, Weaviate, Qdrant, or Chroma.
- Knowledge of model serving tools (vLLM, TGI, Ray Serve).
- Proficiency with Docker, Kubernetes, and cloud environments (AWS preferred).
- Familiarity with prompt engineering, token optimization, chain-of-thought approaches, and evaluation metrics.
- Experience with LLM-specific tooling (LangSmith, Weights & Biases, Phoenix, MLflow).
- Ability to troubleshoot LLM issues such as latency improvements, hallucination mitigation, and context window strategies.
- Strong communication skills with both technical and non-technical stakeholders.
Nice-to-Have:
- Experience with open-source LLMs (Llama, Mistral, etc.).
- Knowledge of advanced RAG techniques including hybrid search and re-ranking.
- Exposure to agent frameworks and real-time LLM applications.
- Background in traditional MLOps, data engineering, or multimodal models.
- Experience with Ruby on Rails.
- Understanding of AI safety and alignment principles.
Our Hiring Process:
Talent Acquisition Screening - 30 minutes
Take-Home Technical - 3 days to complete
Meet Ali (Hiring Manager) - 30-45 minutes
Live PR with our Staff Engineer - 1 hour
Meet The Leaders
Life at Thrive:
- Fast-paced, high-trust environment with significant ownership.
- Opportunity to shape the foundation of Thrive's AI infrastructure from day one.
- Strong career progression and mentorship opportunities.
Total Rewards Package:
- 3 weeks paid vacation, 1-week holiday shutdown
- Health insurance & wellness coverage
- Yearly Learning & Development Allowance
- Yearly Workspace Allowance
At Thrive, we understand and value diversity in our employees and are proud to be an Equal Opportunity Employer. If you require accommodation at any time during the recruitment process, please let us know.
Only those who are legally entitled to work in Canada will be considered for interview and employment.