Role: AI Architect
Location: Charlotte, NC; Dallas, TX; Iselin, NJ (Onsite)
Type: Contract
- We are seeking a Principal GenAI Architect to serve as a hands-on practitioner and core technical visionary. This is a rare, high-impact role requiring deep expertise in Generative AI, distributed systems, and agentic architectures. You will act as the central design authority for our GenAI capabilities within a matrixed organization, bridging internal platform development, third-party vendor reviews, and cutting-edge agentic workflows.
- Your primary mandate is to push the thinking, elevating our AI strategy while remaining deeply hands-on. You will oversee all GenAI use cases, driving architectural excellence across cloud, on-premise, and edge environments, with a specific focus on applications within the regional banking and financial services sector.
Key Responsibilities:
GenAI Architecture & Thought Leadership:
- Serve as the ultimate technical authority for GenAI architecture across the enterprise, reviewing and guiding all AI/ML use cases within a matrixed organization.
- Push the boundaries of our technical vision, acting as a forward-thinking catalyst for how GenAI is built and deployed.
- Lead the architectural review process for all third-party AI integrations coming into the bank (e.g., ServiceNow, Five9, Pega), ensuring they meet strict security, performance, and integration standards.
Agentic Stack & AI Platform Engineering:
- Spearhead the growth and development of our agentic stack, designing agentic frameworks that incorporate robust workflow (WF) logic.
- Architect sophisticated retrieval systems and agent data stacks utilizing vector databases, hybrid search, BM25, and graph-based reasoning.
- Implement solutions for externalized long-term memory, contextual data freshness, and Model Context Protocol (MCP) servers.
- Lead prompt and context engineering strategies to maximize model accuracy and reliability.
Infrastructure, Inference & Edge Computing:
- Design, implement, and scale high-performance distributed systems and AI/ML platforms.
- Optimize LLM inference, implementing advanced batching, caching strategies, and load-balancing techniques.
- Evaluate and implement dynamic deployment strategies, weighing the trade-offs of deploying small/local LLMs at the edge versus leveraging hyperscaler inferencing via cloud APIs.
- Architect and test distributed API gateways across hybrid (cloud and on-premise) environments.
- Oversee on-premise hardware strategy, including rigorous GPU management, utilization, and thermal/compute optimization.
Required Qualifications:
- Engineering Foundation: 12-15 years of experience, with strong proficiency in at least one core programming language (e.g., Python, Go, C) and deep experience building large-scale distributed systems.
- GenAI & LLM Expertise: 5-7 years of hands-on, practitioner-level experience with LLM inference optimization, fine-tuning, and deployment strategies.
- Agentic Architectures: 3-5 years of experience, with a proven track record of building complex agentic systems, evaluation frameworks, and advanced retrieval pipelines (RAG, vector DBs, graph reasoning).
- Cloud & Infrastructure: 10-12 years of extensive experience with Kubernetes, cloud infrastructure (AWS, GCP, or Azure), and managing high-availability platforms.
- Hardware / On-Premise Knowledge: 8-10 years of experience with, and understanding of, GPU orchestration, resource management, and hardware optimization in on-premise or hybrid data centers.
- Strategic Communication: 12-15 years of experience, with the ability to navigate a matrixed organization, translate complex technical trade-offs to leadership, and rigorously evaluate third-party enterprise platforms.
Nice to Have
- Domain experience in the Banking or Financial Services industry.
- Interest in, or hands-on experience with, integrating blockchain technologies and decentralized frameworks.