Principal AI Data Architect

Job Location: Atlanta, GA - USA

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

Principal AI Data Architect

AI-Ready Data Platform, ML/LLMOps, Agentic AI Infrastructure, Governance & Security
Location: REMOTE
Department: Data & AI Engineering

About the Role

We are hiring a Principal AI Data Architect: a hands-on senior individual contributor responsible for designing, building, governing, and evolving the single source of truth that powers all AI initiatives across the organization.

This platform will serve as the foundational backbone for:

  • Conversational AI assistants
  • Dashboard intelligence
  • Autonomous AI agents
  • RAG-powered applications
  • Predictive ML models

You will:

  • Architect and implement the platform
  • Define and enforce data contracts
  • Govern access for both humans and AI agents
  • Ensure accuracy, reliability, and traceability of AI outputs

Key Responsibilities

1. AI-Ready Data Platform (Single Source of Truth)

  • Architect and own the enterprise AI data platform
  • Design multi-domain data models (lakehouse, data mesh, event-driven)
  • Own the full data stack:
    • Streaming: Kafka, Spark Structured Streaming
    • Batch: Databricks, PySpark, Delta Lake
    • Cloud: AWS, Azure
  • Eliminate data silos and ensure a unified data layer
  • Modernize legacy ETL and DWH systems to cloud-native architectures

2. Semantic Models & Knowledge Layer

  • Design a semantic layer with:
    • Ontologies, taxonomies, entity relationships
  • Build and maintain knowledge graphs (e.g., Neo4j)
  • Define feature store and semantic data contracts
  • Ensure metadata management, lineage, and auditability

3. RAG, Vector & Retrieval Infrastructure

  • Design embedding pipelines and vector stores:
    • Pinecone, FAISS, ChromaDB, OpenSearch
  • Define retrieval data contracts for AI systems
  • Optimize for:
    • Precision, recall, latency, and cost

4. ML/LLMOps Infrastructure

  • Build ML and LLMOps pipelines:
    • Training data pipelines
    • Feature engineering
    • Model registry (MLflow)
  • Implement CI/CD for AI systems:
    • Validation, deployment, rollback, monitoring
  • Support LLM fine-tuning workflows:
    • RLHF pipelines
    • Data curation and filtering
  • Establish best practices:
    • Versioning, A/B testing, canary releases

5. Multi-Consumer AI Serving Architecture

Design data services for:

  • Conversational AI
    • Low-latency APIs for chatbots and copilots
  • BI & Dashboard Assistants
    • Semantic query layer and text-to-SQL
  • Autonomous AI Agents
    • Tool APIs, memory/state management
  • Predictive ML Models
    • Feature pipelines and real-time serving
  • AI Experimentation
    • Secure sandbox environments

6. Governance, Security & Access Control

  • Implement RBAC and attribute-based access controls
  • Enforce agent-specific permissions
  • Ensure:
    • PII masking
    • Encryption
    • Audit logging
    • Compliance (SOX, GDPR, SOC 2, AML/KYC)
  • Define schema governance and versioning
  • Maintain audit trails and data provenance

7. Agentic Observability & Output Accuracy

  • Build observability for AI agents:
    • Inputs, outputs, reasoning traces
  • Define evaluation metrics:
    • Accuracy, hallucination rate, relevance
  • Create feedback loops to improve data quality
  • Define SLAs for data freshness and AI accuracy
  • Implement human-in-the-loop workflows

8. Architecture Standards & Enablement

  • Define reference architecture and standards
  • Establish:
    • Testing frameworks
    • CI/CD pipelines
    • Infrastructure-as-Code (Terraform)
  • Lead design reviews and architecture governance
  • Conduct internal workshops and enablement sessions

Required Qualifications

Experience

  • 15 years in data engineering/architecture
  • 3 to 5 years in AI/ML/LLM data platforms
  • Experience designing enterprise-scale AI platforms
  • Strong background in regulated industries

Technical Skills

Expert:

  • Python, SQL, PySpark
  • Kafka, Databricks, Delta Lake, Snowflake
  • AWS (S3, Glue, EKS, Bedrock, Kinesis, Redshift)
  • Docker, Kubernetes, Terraform, CI/CD

Strong:

  • LangChain, LlamaIndex
  • LLM APIs (OpenAI, Bedrock, Claude, HuggingFace)
  • Vector DBs (Pinecone, FAISS, ChromaDB, OpenSearch)
  • Knowledge graphs (Neo4j)

Working Knowledge:

  • MLflow, FastAPI
  • Observability tools (Grafana, CloudWatch)
  • Data lineage and metadata tools

Preferred Qualifications

  • Degree in Computer Science or related field
  • Experience in presales / solution architecture
  • Background in financial services, SaaS, or regulated industries
  • Familiarity with:
    • MCP (Model Context Protocol)
    • Agent frameworks (LangGraph, AutoGen, CrewAI)
  • Experience with AI observability systems

Success Metrics (First 12 Months)

  • AI platform architecture adopted within 120 days
  • Platform serving 3 AI consumer types
  • Governance framework fully implemented
  • Observability and evaluation systems operational
  • Measurable improvement in AI output accuracy
  • Legacy modernization delivering cost and speed benefits
  • Engineering standards adopted organization-wide

Technology Stack

Databricks, Delta Lake, PySpark, Kafka, Snowflake, AWS, Azure, Kubernetes, Docker, Terraform, MLflow, LangChain, LlamaIndex, OpenAI, Bedrock, Claude, Pinecone, FAISS, ChromaDB, OpenSearch, Neo4j, FastAPI, Python, SQL, MCP, LangGraph, CI/CD, Grafana, CloudWatch
