Staff Developer, AI Evaluation & Reliability

Caseware

Job Location:

Medellín - Colombia

Monthly Salary: Not Disclosed
Posted on: 14 hours ago
Vacancies: 1 Vacancy

Job Summary

Caseware is one of Canada's original Fintech companies, having led the global audit and accounting software industry for over 30 years, with more than 500,000 users across 130 countries and availability in 16 different languages. While you might not have heard of us (yet), over 36,000 accounting and audit professionals list Caseware as a skill on their LinkedIn profiles!

As we build the next generation of intelligent, cloud-based solutions for auditors, accountants, and financial professionals, agentic AI is a core pillar of our strategy. We are developing a reusable, enterprise-grade agentic AI platform that enables product teams across Caseware Cloud to safely, consistently, and efficiently deliver AI-powered capabilities in highly regulated environments.
We are looking for a Staff Developer, AI Evaluation & Reliability to raise the bar on the quality, trustworthiness, and operational reliability of our AI platform. This is a senior individual contributor role with broad technical influence and leadership expectations. You will own how our agentic systems are evaluated, validated, and governed in production, and help define the standards that product teams across Caseware rely on.
In this role, you will provide technical stewardship for evaluation frameworks, reliability mechanisms, and compliance-aligned controls that sit at the center of Caseware's AI platform. You will work closely with Staff Engineers, Product Management, QA, Security, Data, and Infrastructure teams to ensure the platform scales reliably, meets enterprise and regulatory standards, and delivers measurable value to both product teams and customers.

Location: This is a fully remote position located in Colombia.

What you will be doing

    • Own and evolve evaluation strategy for LLM- and agent-based systems, including golden datasets, rubric-based scoring, reference-free evaluations, regression testing, and A/B experimentation.
    • Benchmark and analyze foundation model performance within Caseware's domain, identifying capability gaps, failure modes, and opportunities for improvement.
    • Lead the design and optimization of Retrieval-Augmented Generation (RAG) pipelines, including embeddings, retrieval strategies, reranking, and retrieval quality metrics.
    • Design and maintain feedback and evaluation pipelines that connect real-world user behavior to measurable improvements in agent performance.
    • Apply data science techniques to analyze agent behavior, diagnose reliability issues, detect drift, and surface systemic risks.
    • Define and implement guardrails for agentic systems, including schema validation, content filtering, tool governance, and policy enforcement.
    • Establish approval gates, audit trails, and controlled rollout mechanisms for AI and agent changes, including feature flags, staged deployments, and kill switches.
    • Partner with Security and Data teams to embed privacy-by-design practices, including PII detection and masking, data minimization, and retention controls.
    • Support and influence SOC 2 and ISO 27001-aligned controls across AI data flows, including access management, logging, and incident response.
    • Act as a Staff-level technical leader, mentoring other engineers, shaping best practices, and raising the overall bar for AI reliability and evaluation across the organization.

What you'll bring

    • Strong data science foundation, including Python, SQL, statistics, and experiment design.
    • Deep hands-on experience with LLMs, prompting strategies, and agent reasoning patterns.
    • Practical expertise with embeddings, vector databases, retrieval metrics, and reranking approaches.
    • Proven experience designing or operating evaluation frameworks for generative AI or agentic systems, including automated and human-in-the-loop evaluation.
    • Strong understanding of AI reliability, safety, and governance, including guardrails, validation, monitoring, and change control.
    • Working knowledge of privacy engineering principles and familiarity with GDPR/CCPA concepts such as consent, purpose limitation, and data subject rights.
    • Experience operating in enterprise or regulated environments, including contributions to SOC 2 / ISO 27001-aligned systems and processes.
    • Ability to influence across teams, communicate clearly about complex AI trade-offs, and drive alignment without direct authority.
    • Strong English language communication and collaboration skills.

Nice to have

    • Experience with agent frameworks such as LangChain or similar.
    • Domain experience in finance, accounting, or other regulated industries (e.g. healthcare, legal).
    • Experience with AI safety or red-teaming, including prompt injection, data exfiltration, or tool misuse.
    • Familiarity with governed change management, including feature flags, staged rollouts, and kill switches.
    • Experience with agentic coding or autonomous development workflows.

Technology stack your team works with

    • Backend & Platform: TypeScript, NestJS, Python
    • Cloud & Infrastructure: AWS EKS, AWS Lambda, AWS Bedrock, AWS AgentCore
    • Search & Retrieval: AWS OpenSearch Serverless
    • Document & Data Processing: AWS Textract, DynamoDB, S3
    • AI Evaluation & Observability: LangFuse, LangSmith (or equivalent)
    • AI-assisted development tools: GitHub Copilot, AWS Kiro
    • Developer Tooling: GitHub, GitHub Actions, Nx Monorepo
    • Collaboration: Jira, Confluence, Microsoft Teams, Outlook

Perks & Benefits

    • Indefinite-term employment contract (contrato a término indefinido) with all the legal benefits
    • Prepaid Medicine
    • Life insurance and funeral assistance
    • Internet allowance
    • Home office stipend
    • Competitive compensation above the market average
    • 100% remote work environment and an excellent work-life balance
    • Opportunity to work for a growing global SaaS leader company
    • A culture that promotes independence, innovation, trust, and accountability
    • Open space to be creative, innovative, and strategize for the future
    • Mentorship by highly experienced professionals
    • Budget for training; we want you to grow
    • 5 Personal Time Off days per year
    • Sick leave top-up to a total of 100% of salary, paid by the employer from Day 3 to 90
    • Recognition Award: additional paid time off in recognition of the corresponding year of service
    • Upgraded vacation starting at 5 years of service

What's in it for you:

Innovation is at our core. We work with cutting-edge technology in accounting and financial reporting, constantly pushing the boundaries to create impactful software solutions.
We are committed to a collaborative culture where your ideas are valued and knowledge sharing is encouraged within a supportive, inclusive team.
Work-life balance is important to us. We offer flexible work options, remote opportunities, and generous time-off policies to support it.
We offer competitive compensation, including salary and comprehensive benefits such as health insurance and retirement plans.
We are driven by impactful work. Your contributions directly affect how our clients manage financial processes and drive their success.
Recognition and rewards matter to us. We celebrate hard work through recognition programs, performance bonuses, and opportunities for career growth.
We embrace global opportunities. Work on international projects and collaborate with a diverse global team.

About Caseware:
Caseware's cutting-edge software products are meticulously designed for accounting firms and corporations, and our teams are continually collaborating, innovating, and building upon our existing suite of products. With a customer-focused mindset, we are building technology that is shaping what the future of audits, financial reporting, and financial data analytics will look like.

With a recent strategic investment from Hg Capital in 2020, Caseware is now in its next major growth phase as we double down on the people and products that have made Caseware so successful to date.

One of Caseware's core values is "Many Voices, One Team," and with that in mind, we're dedicated to building teams as diverse as our customers in an equitable and inclusive way. We welcome and encourage candidates of all backgrounds to apply. Should you require accommodations or have any questions at any point during the application or interview process, please e-mail our People Operations team at [email protected].

Background Check:
Any candidates successful in obtaining an offer for a position will need to successfully complete a background check through which typically includes an Identity Verification and Criminal Record Check. Executives and Senior Managers will undergo a Soft Credit Check as well. Candidates residing in the Netherlands and Germany are excluded from undergoing background checks via

Security and Fraud:
Caseware takes the security of candidates seriously. All legitimate communication from us will come from email addresses ending in @, and our open positions are always listed on reputable job boards and on our website. We will NEVER ask for payment or financial information from you. If you receive an unsolicited job offer, proceed with extreme caution.

Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Accounting, audit, analytics and compliance software built by seasoned accountants. Manage your audit and financial reporting more efficiently with less risk.