Job Title
Generative AI Engineer (Data/ML/GenAI)
Jersey City, NJ
Full-time
Job Summary
We're hiring a Generative AI Engineer with 6+ years across Data/ML/GenAI who can design, build, and productionize LLM-powered systems end-to-end. You'll select and fine-tune models (OpenAI, Anthropic, Google, Meta, open-source), craft robust RAG/agentic workflows (AutoGen, LangGraph, CrewAI, LangChain/LlamaIndex), and ship secure, observable services with FastAPI, Docker, and Kubernetes. You pair strong software engineering with MLOps/LLMOps rigor: evaluation, monitoring, safety/guardrails, and cost/latency optimization.
Key Responsibilities
- Solution architecture: Own E2E design for chat/agents, structured generation, summarization/classification, and workflow automation. Choose the right model, or a non-LLM alternative, and justify the trade-offs.
- Prompting & tuning: Build prompt stacks (system/task/tool) and synthetic data pipelines; fine-tune models or train LoRA adapters; apply instruction tuning/RLHF where warranted.
- Agentic systems: Implement multi-agent/tool-calling workflows using frameworks such as AutoGen, LangGraph, and CrewAI (state management, retries, tool safety, fallbacks, grounding).
- RAG at scale: Stand up retrieval stacks with vector DBs (Pinecone/Faiss/Weaviate/pgvector), chunking and citation strategies, reranking, and caching; enforce traceability.
- APIs & deployment: Ship FastAPI services, containerize (Docker), orchestrate (Kubernetes/Cloud Run), and wire CI/CD and IaC; design SLAs/SLOs for reliability and cost.
- LLMOps & observability: Instrument evals (unit, regression, A/B), add tracing and metrics (Langfuse, LangSmith, OpenTelemetry), and manage model/version registries (MLflow/W&B).
- Safety & governance: Implement guardrails (prompt injection/PII/toxicity), policy filters (Bedrock Guardrails/Azure AI Content Safety/OpenAI Moderation), access controls, and compliance logging.
- Data & pipelines: Build/maintain data ingestion, cleansing, and labeling workflows for model/retrieval corpora; ensure schema/version governance.
- Performance & cost: Optimize with batching, streaming, JSON-schema/function calling, tool use, speculative decoding/KV caching, and token budgets.
- Collaboration & mentoring: Partner with product/engineering/DS; review designs/PRs, mentor juniors, and drive best practices/playbooks.
Preferred Qualifications
- Agent ecosystems: Deeper experience with multi-agent planning/execution, tool catalogs, and failure-mode design.
- Search & data stores: Experience with pgvector/Elasticsearch/OpenSearch; comfort with relational/NoSQL/graph stores.
- Advanced evals: Human-in-the-loop pipelines, golden sets, regression suites, and cost/quality dashboards.
- Open-source & thought leadership: OSS contributions, publications, talks, or a strong portfolio demonstrating GenAI craftsmanship.
Nice to Have
- Eventing & rate limiting: Redis/Celery task queues and concurrency controls for bursty LLM traffic.
- Enterprise integrations: Experience with API gateways (e.g., MuleSoft), authN/Z, and vendor compliance reviews.
- Domain experience: Prior work in data-heavy or regulated domains (finance/health/gov) with auditable GenAI outputs.
Requirements
- Experience: 6+ years across Data/ML/GenAI, with 1-2+ years designing and shipping LLM or GenAI apps to production.
- Languages & APIs: Strong Python and FastAPI; proven experience building secure, reliable REST services and integrations.
- Models & frameworks: Hands-on with OpenAI/Anthropic/Gemini/Llama families and at least two of: AutoGen, LangGraph, CrewAI, LangChain, LlamaIndex, Transformers.
- RAG & retrieval: Practical experience implementing vector search and reranking, plus offline/online evals (e.g., RAGAS, promptfoo, custom harnesses).
- Cloud & DevOps: Docker, Kubernetes (or managed equivalents), and one major cloud (AWS/Azure/GCP); CI/CD and secrets management.
- Observability: Familiarity with tracing/metrics tools (e.g., Langfuse, LangSmith, OpenTelemetry) and setting SLIs/SLOs.
- Security & governance: Working knowledge of data privacy, PII handling, content safety, and policy/controls for enterprise deployments.
- Communication: Clear technical writing and cross-functional collaboration; ability to translate business goals into architecture and milestones.