Principal AI Data Architect

American IT Systems

Job Location:

Atlanta, GA - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Principal AI Data Architect

AI-Ready Data Platform, ML/LLMOps, Agentic AI Infrastructure, Governance & Security
Location: REMOTE
Department: Data & AI Engineering

About the Role

We are hiring a Principal AI Data Architect: a hands-on senior individual contributor responsible for designing, building, governing, and evolving the single source of truth that powers all AI initiatives across the organization.

This platform will serve as the foundational backbone for:

Conversational AI assistants
Dashboard intelligence
Autonomous AI agents
RAG-powered applications
Predictive ML models

You will:

Architect and implement the platform
Define and enforce data contracts
Govern access for both humans and AI agents
Ensure accuracy, reliability, and traceability of AI outputs

Key Responsibilities

1. AI-Ready Data Platform (Single Source of Truth)

Architect and own the enterprise AI data platform
Design multi-domain data models (lakehouse, data mesh, event-driven)
Own full data stack:
- Streaming: Kafka, Spark Structured Streaming
- Batch: Databricks, PySpark, Delta Lake
- Cloud: AWS, Azure
Eliminate data silos and ensure a unified data layer
Modernize legacy ETL and DWH systems to cloud-native architectures

2. Semantic Models & Knowledge Layer

Design semantic layer with:
- Ontologies, taxonomies, entity relationships
Build and maintain knowledge graphs (e.g. Neo4j)
Define feature store and semantic data contracts
Ensure metadata management, lineage, and auditability

3. RAG, Vector & Retrieval Infrastructure

Design embedding pipelines and vector stores:
- Pinecone, FAISS, ChromaDB, OpenSearch
Define retrieval data contracts for AI systems
Optimize for:
- Precision, recall, latency, and cost

4. ML/LLMOps Infrastructure

Build ML and LLMOps pipelines:
- Training data pipelines
- Feature engineering
- Model registry (MLflow)
Implement CI/CD for AI systems:
- Validation, deployment, rollback, monitoring
Support LLM fine-tuning workflows:
- RLHF pipelines
- Data curation and filtering
Establish best practices:
- Versioning, A/B testing, canary releases

5. Multi-Consumer AI Serving Architecture

Design data services for:

Conversational AI
- Low-latency APIs for chatbots and copilots
BI & Dashboard Assistants
- Semantic query layer and text-to-SQL
Autonomous AI Agents
- Tool APIs, memory/state management
Predictive ML Models
- Feature pipelines and real-time serving
AI Experimentation
- Secure sandbox environments

6. Governance, Security & Access Control

Implement RBAC and attribute-based access controls
Enforce agent-specific permissions
Ensure:
- PII masking
- Encryption
- Audit logging
- Compliance (SOX, GDPR, SOC2, AML/KYC)
Define schema governance and versioning
Maintain audit trails and data provenance

7. Agentic Observability & Output Accuracy

Build observability for AI agents:
- Inputs, outputs, reasoning traces
Define evaluation metrics:
- Accuracy, hallucination rate, relevance
Create feedback loops to improve data quality
Define SLAs for data freshness and AI accuracy
Implement human-in-the-loop workflows

8. Architecture Standards & Enablement

Define reference architecture and standards
Establish:
- Testing frameworks
- CI/CD pipelines
- Infrastructure-as-Code (Terraform)
Lead design reviews and architecture governance
Conduct internal workshops and enablement sessions

Required Qualifications

Experience

15 years in data engineering/architecture
3-5 years in AI/ML/LLM data platforms
Experience designing enterprise-scale AI platforms
Strong background in regulated industries

Technical Skills

Expert:

Python, SQL, PySpark
Kafka, Databricks, Delta Lake, Snowflake
AWS (S3, Glue, EKS, Bedrock, Kinesis, Redshift)
Docker, Kubernetes, Terraform, CI/CD

Strong:

LangChain, LlamaIndex
LLM APIs (OpenAI, Bedrock, Claude, HuggingFace)
Vector DBs (Pinecone, FAISS, ChromaDB, OpenSearch)
Knowledge graphs (Neo4j)

Working Knowledge:

MLflow, FastAPI
Observability tools (Grafana, CloudWatch)
Data lineage and metadata tools

Preferred Qualifications

Degree in Computer Science or related field
Experience in presales / solution architecture
Background in financial services, SaaS, or regulated industries
Familiarity with:
- MCP (Model Context Protocol)
- Agent frameworks (LangGraph, AutoGen, CrewAI)
Experience with AI observability systems

Success Metrics (First 12 Months)

AI platform architecture adopted within 120 days
Platform serving 3 AI consumer types
Governance framework fully implemented
Observability and evaluation systems operational
Measurable improvement in AI output accuracy
Legacy modernization delivering cost and speed benefits
Engineering standards adopted organization-wide

Technology Stack

Databricks, Delta Lake, PySpark, Kafka, Snowflake, AWS, Azure, Kubernetes, Docker, Terraform, MLflow, LangChain, LlamaIndex, OpenAI, Bedrock, Claude, Pinecone, FAISS, ChromaDB, OpenSearch, Neo4j, FastAPI, Python, SQL, MCP, LangGraph, CI/CD, Grafana, CloudWatch
