Role : AI Platform Engineer (Guardrails Observability & Evaluation Infrastructure)
Location : Charlotte NC (100% onsite)
AI Platform Engineer to design and build the foundational components that power enterprise-scale GenAI
applications. This includes data guardrails model safety tooling observability pipelines evaluation harnesses and
standardized logging/monitoring frameworks. This role is critical for enabling safe reliable and compliant AI
development across multiple use cases teams and business units. Idea is to create the common platform services
that AI team will build upon. Key Responsibilities1. Guardrails Safety & Governance
Design and implement data guardrail frameworks (pre-processing redaction PII/PHI filtering DLP
integration prompt defenses).
Build Model Armor components such as:
Input validation & sanitization
Prompt-injection defenses
Harmful content detection & policy enforcement
Output filtering factchecking grounding checks
Integrate safety tooling (policy engines classifiers DLP APIs/safety models).
Collaborate with Security Compliance and Data Privacy teams to ensure frameworks meet enterprise
governance requirements.
2. Observability Frameworks
Build and maintain observability pipelines using tools like Arize AI (tracing quality metrics dataset
drift/hallucination tracking embedding monitoring).
Define and enforce platform-wide standards for:
Tracing LLM calls
Token usage and cost monitoring
Latency and reliability metrics
Prompt/model version tracking
Provide reusable SDKs or middleware for engineering teams to adopt observability with minimal friction.
3. Logging Monitoring & Telemetry
Design standardized LLM-specific logging schemas including:
Inputs/outputs
Model metadata
Retrieval metadata
Safety flags
User context and attribution
Build monitoring dashboards for performance cost anomalies errors and safety events.
Implement alerting and SLOs/SLIs for LLM inference systems.
4. Evaluation Infrastructure
Architect and maintain evaluation harnesses for GenAI systems including:
RAG evaluation (faithfulness relevance hallucination risk)
Summarization/QA evaluation
Human-in-the-loop review workflows
Automated eval pipelines integrated into CI/CD
Support frameworks such as RAGAS G-Eval rubric scoring pairwise comparisons and test case
generation.
Build reusable tooling for teams to write run and track model evaluations.
5. Platform Engineering & Reusable Components
Develop shared libraries APIs and services for:
Prompt management/versioning
Embedding pipelines and model wrappers
Retrieval adapters
Common data loaders and document preprocessing
Tool/function schemas
Drive consistency across teams with standards reference architectures and best practices.
Review system designs across use cases to ensure alignment to platform patterns.
6. Collaboration & Enablement
Partner with AI engineers product teams and data scientists to understand cross-cutting needs and convert
them into reusable platform features.
Create documentation onboarding guides examples and developer tooling.
Provide internal training (brown bags workshops) on guardrails observability and evaluation frameworks.
Required Qualifications Technical Skills
5-10 years software engineering or ML infrastructure experience.
Strong Python engineering fundamentals (FastAPI async typing/Pydantic testing).
Experience with model safety/guardrails approaches (prompt injection defense PII redaction toxicity filters policy enforcement).
Hands-on with Arize AI LangSmith or similar LLM observability platforms.
Experience creating evaluation frameworks using RAGAS G-Eval or custom rubric systems.
Strong familiarity with vector databases (Pinecone Weaviate Milvus) embeddings and retrieval pipelines.
Solid understanding of LLM architectures tokenization embeddings context limits and RAG patterns.
Experience in cloud (GCP preferred) Kubernetes/GE containers and CI/CD.
Strong understanding of security governance DLP data privacy RBAC and enterprise compliance requirements.
Soft Skills
Strong documentation and communication skills.
Ability to influence engineering teams and standardize best practices.
Comfortable working across multiple stakeholders-platform security ML engineering product.
Nice to Have
Experience with LangChain/LangGraph or Llamalndex orchestrations.
Experience with Rebuff Protect AI or similar LLM security tooling.
Experience with GCP Vertex AI pipelines Model Monitoring and Vector Search.
Familiarity with knowledge graphs grounding models fact-checking models.
Building SDKs or developer frameworks adopted across multiple teams.
On-prem or hybrid AI deployment experience.
Role : AI Platform Engineer (Guardrails Observability & Evaluation Infrastructure) Location : Charlotte NC (100% onsite) AI Platform Engineer to design and build the foundational components that power enterprise-scale GenAI applications. This includes data guardrails model safety tooling observab...
Role : AI Platform Engineer (Guardrails Observability & Evaluation Infrastructure)
Location : Charlotte NC (100% onsite)
AI Platform Engineer to design and build the foundational components that power enterprise-scale GenAI
applications. This includes data guardrails model safety tooling observability pipelines evaluation harnesses and
standardized logging/monitoring frameworks. This role is critical for enabling safe reliable and compliant AI
development across multiple use cases teams and business units. Idea is to create the common platform services
that AI team will build upon. Key Responsibilities1. Guardrails Safety & Governance
Design and implement data guardrail frameworks (pre-processing redaction PII/PHI filtering DLP
integration prompt defenses).
Build Model Armor components such as:
Input validation & sanitization
Prompt-injection defenses
Harmful content detection & policy enforcement
Output filtering factchecking grounding checks
Integrate safety tooling (policy engines classifiers DLP APIs/safety models).
Collaborate with Security Compliance and Data Privacy teams to ensure frameworks meet enterprise
governance requirements.
2. Observability Frameworks
Build and maintain observability pipelines using tools like Arize AI (tracing quality metrics dataset
drift/hallucination tracking embedding monitoring).
Define and enforce platform-wide standards for:
Tracing LLM calls
Token usage and cost monitoring
Latency and reliability metrics
Prompt/model version tracking
Provide reusable SDKs or middleware for engineering teams to adopt observability with minimal friction.
3. Logging Monitoring & Telemetry
Design standardized LLM-specific logging schemas including:
Inputs/outputs
Model metadata
Retrieval metadata
Safety flags
User context and attribution
Build monitoring dashboards for performance cost anomalies errors and safety events.
Implement alerting and SLOs/SLIs for LLM inference systems.
4. Evaluation Infrastructure
Architect and maintain evaluation harnesses for GenAI systems including:
RAG evaluation (faithfulness relevance hallucination risk)
Summarization/QA evaluation
Human-in-the-loop review workflows
Automated eval pipelines integrated into CI/CD
Support frameworks such as RAGAS G-Eval rubric scoring pairwise comparisons and test case
generation.
Build reusable tooling for teams to write run and track model evaluations.
5. Platform Engineering & Reusable Components
Develop shared libraries APIs and services for:
Prompt management/versioning
Embedding pipelines and model wrappers
Retrieval adapters
Common data loaders and document preprocessing
Tool/function schemas
Drive consistency across teams with standards reference architectures and best practices.
Review system designs across use cases to ensure alignment to platform patterns.
6. Collaboration & Enablement
Partner with AI engineers product teams and data scientists to understand cross-cutting needs and convert
them into reusable platform features.
Create documentation onboarding guides examples and developer tooling.
Provide internal training (brown bags workshops) on guardrails observability and evaluation frameworks.
Required Qualifications Technical Skills
5-10 years software engineering or ML infrastructure experience.
Strong Python engineering fundamentals (FastAPI async typing/Pydantic testing).
Experience with model safety/guardrails approaches (prompt injection defense PII redaction toxicity filters policy enforcement).
Hands-on with Arize AI LangSmith or similar LLM observability platforms.
Experience creating evaluation frameworks using RAGAS G-Eval or custom rubric systems.
Strong familiarity with vector databases (Pinecone Weaviate Milvus) embeddings and retrieval pipelines.
Solid understanding of LLM architectures tokenization embeddings context limits and RAG patterns.
Experience in cloud (GCP preferred) Kubernetes/GE containers and CI/CD.
Strong understanding of security governance DLP data privacy RBAC and enterprise compliance requirements.
Soft Skills
Strong documentation and communication skills.
Ability to influence engineering teams and standardize best practices.
Comfortable working across multiple stakeholders-platform security ML engineering product.
Nice to Have
Experience with LangChain/LangGraph or Llamalndex orchestrations.
Experience with Rebuff Protect AI or similar LLM security tooling.
Experience with GCP Vertex AI pipelines Model Monitoring and Vector Search.
Familiarity with knowledge graphs grounding models fact-checking models.
Building SDKs or developer frameworks adopted across multiple teams.
On-prem or hybrid AI deployment experience.
View more
View less