AI Test Architect

Medellín - Colombia

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Caseware is one of Canada's original Fintech companies, having led the global audit and accounting software industry for over 30 years, with more than 500,000 users across 130 countries and availability in 16 different languages. While you might not have heard of us (yet), over 36,000 accounting and audit professionals list Caseware as a skill on their LinkedIn profiles!

Why This Role Matters

As a leader in cloud-native SaaS, we are accelerating our shift to an AI-first future, embedding generative AI and autonomous agents across our platform to deliver smarter, faster user experiences. We are on the lookout for a visionary AI Test Architect to build the next-generation Quality Intelligence platform: one that leverages generative AI for automated test creation, self-healing execution, predictive defect analytics, and rigorous validation of our AI features, built in-house for our global audience.

As our foundational AI Test Architect, you'll design scalable, ethical frameworks that ensure reliability, safety, and compliance while accelerating release velocity (targeting 30-50% faster cycles through AI-augmented testing). Your work will reduce risk in production AI agents, minimize hallucinations/bias/security exposures, and empower the entire engineering organization to adopt AI-augmented quality practices that supplement the traditional, mature frameworks we have. This high-impact role sits at the intersection of Platform Engineering, AI, and Quality, shaping how we build trustworthy intelligence at scale.

Location: This is a fully remote position located in Colombia.

You will be reporting to:

Jai Joshi

Contact:

Maira Russo, Senior Talent Acquisition Partner

What You'll Be Doing

AI-Driven Quality Strategy & Architecture
Architect a comprehensive Quality Intelligence platform using generative AI to predict defect hotspots, intelligently optimize regression suites, auto-generate tests, and enable self-healing automation.
Define an enterprise-wide AI-first testing strategy, including non-deterministic evaluation paradigms, continuous monitoring for drift/hallucination, and integration across the full SDLC.
Establish governance for ethical AI testing, aligning with emerging standards.

& Agent Evaluation Frameworks
Design and implement advanced benchmarks, red teaming protocols, and adversarial testing for internal AI agents and generative features, focusing on hallucination rates, bias/fairness, prompt injection, jailbreaks, and goal misalignment.
Build evaluation pipelines with statistical rigor (e.g., multi-trial runs, LLM-as-judge, human-in-the-loop) using tools like Langfuse, LangSmith, DeepEval, RAGAS, or Arize Phoenix for metrics such as faithfulness, context precision, and safety compliance.
Architect harnesses for agentic workflows, tool-calling, planning, multi-agent simulations, and post-deployment observability.

& Automation Architecture
Partner with DevOps to embed AI-based testing into GitHub-based CI/CD pipelines (e.g., AI-generated tests, predictive flakiness detection, automated gating with quality signals).
Lead design of self-healing test frameworks (integrating AI plugins with Playwright/Cypress or similar) that adapt to UI/model changes with minimal maintenance.
Architect synthetic data generation and AI-powered data masking solutions, and maintain golden datasets, to enable privacy-compliant, high-fidelity testing at scale.

Cross-Functional Leadership & Evangelism
Collaborate with product, data science, ML engineering, and security teams to influence AI feature design with quality guardrails from day one.
Evangelize and mentor: upskill traditional QA engineers into AI-augmented testers through workshops, playbooks, and communities of practice.
Drive adoption of AI quality best practices organization-wide, including metrics dashboards for DORA and AI-specific indicators (e.g., hallucination rate, red team success rate, self-healing coverage).

Metrics & Continuous Evolution
Define and implement AI-specific quality telemetry (e.g., drift detection, faithfulness scoring, compliance incidents) integrated with tools like Langfuse.
Establish feedback loops for model iteration, A/B testing, guardrails, and proactive risk mitigation in production.

Challenges You'll Architect Solutions For

Building reliable evaluation for non-deterministic agentic AI in a fast-moving SaaS landscape.
Scaling self-healing and generative test automation without introducing new flakiness or security debt.
Balancing innovation speed with rigorous red teaming and ethical safeguards for customer-facing AI.

Success in the First 6-12 Months

Launch the Quality Intelligence platform foundation with AI-augmented pipelines covering >70% of critical paths.
Establish red teaming/red-teaming-as-code processes that reduce high-severity AI risks by >40%.
Upskill >50% of QA/engineering teams on AI testing fundamentals and deliver measurable velocity/safety gains.
Accuracy baseline: establish a baseline 90% faithfulness score for all RAG-powered features.

What You Will Bring

8 years in Quality Engineering/Test Architecture within cloud-native SaaS environments, with 2 years focused on AI/ML/LLM testing and validation.
Deep expertise in AWS (serverless, microservices, IaC with Terraform/CloudFormation) and GitHub CI/CD ecosystems.
Proficiency in architecting LLM-based applications and testing frameworks (LangChain/LangGraph/LangSmith strongly preferred; equivalents acceptable).
Mastery of modern automation (Playwright, Cypress) with hands-on experience integrating self-healing AI plugins or generative test tools.
Strong programming skills in JavaScript/TypeScript and/or Python; solid understanding of foundational AI concepts (transformers, embeddings, RAG, evaluation trade-offs).
Experience with LLM evaluation tools like Bedrock Evaluations, Prompt Management, Guardrails, DeepEval, RAGAS, Arize Phoenix, and Langfuse.
Experience with red teaming frameworks/tools (Cobalt Strike, Sliver, Nmap) and knowledge of adversarial testing methodologies is a bonus.
Proven leadership: mentoring teams, defining standards, and driving cross-functional change in ambiguous, high-growth settings.
Bachelor's/Master's in Computer Science, AI/ML, or equivalent; relevant certifications a strong plus.
Strong English language communication and collaboration skills.

Perks & Benefits

Indefinite-term contract (contrato a término indefinido) with all the legal benefits
Prepaid Medicine
Life insurance and funeral assistance
Internet allowance
Home office stipend
Competitive compensation above the market average
100% remote work environment and an excellent work-life balance
Opportunity to work for a growing global SaaS leader
A culture that promotes independence, innovation, trust, and accountability
Open space to be creative, innovate, and strategize for the future
Mentorship by a highly experienced professional
Budget for training: we want you to grow
5 Personal Time Off days per year
Sick leave top-up to a total of 100% of salary, paid by the employer from Day 3 to Day 90
Recognition Award: additional paid time off in recognition of the corresponding year of service
Upgraded vacation starting at 5 years of service

What's in it for you:

Innovation is at our core. We work with cutting-edge technology in accounting and financial reporting, constantly pushing the boundaries to create impactful software solutions.

We are committed to a collaborative culture where your ideas are valued and knowledge sharing is encouraged within a supportive, inclusive team.

Work-life balance is important to us. We offer flexible work options, remote opportunities, and generous time-off policies to ensure a healthy work-life balance.

We offer competitive compensation, including a competitive salary and comprehensive benefits such as health insurance and retirement plans.

We are driven by impactful work. Your contributions directly affect how our clients manage financial processes and drive their success.

Recognition and rewards matter to us. We celebrate hard work through recognition programs, performance bonuses, and opportunities for career growth.

We embrace global opportunities. Work on international projects and collaborate with a diverse global team.

About Caseware:

Caseware's cutting-edge software products are meticulously designed for accounting firms and corporations, and our teams are continually collaborating, innovating, and building upon our existing suite of products. With a customer-focused mindset, we are building technology that is shaping what the future of audits, financial reporting, and financial data analytics will look like.

With a recent strategic investment from Hg Capital in 2020, Caseware is now in its next major growth phase as we double down on the people and products that have made Caseware so successful to date.

One of Caseware's core values is "Many Voices, One Team," and with that in mind, we're dedicated to building teams as diverse as our customers in an equitable and inclusive way. We welcome and encourage candidates of all backgrounds to apply. Should you require accommodations or have any questions at any point during the application or interview process, please e-mail our People Operations team at emailprotected.

Background Check:

Any candidates successful in obtaining an offer for a position will need to successfully complete a background check through which typically includes an Identity Verification and Criminal Record Check. Executives and Senior Managers will undergo a Soft Credit Check as well. Candidates residing in the Netherlands and Germany are excluded from undergoing background checks via

Security and Fraud:

Caseware takes the security of candidates seriously. All legitimate communication from us will come from email addresses ending in @ and our open positions are always listed on reputable job boards and on our website. We will NEVER ask for payment or financial information from you. If you receive an unsolicited job offer, proceed with extreme caution.

Required Experience:

Staff IC

Key Skills

APIs
Pegasystems
Spring
SOAP
.NET
Hybris
Solution Architecture
Service-Oriented Architecture
Adobe Experience Manager
J2EE
Java
Oracle

About Company

Caseware

Accounting, audit, analytics and compliance software built by seasoned accountants. Manage your audit and financial reporting more efficiently with less risk.
