Generative AI Engineer

Raritan, NJ - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

- Design and implement scalable Retrieval-Augmented Generation (RAG) pipelines for large-scale document processing and semantic chunking from cloud sources (Azure Blob, AWS S3).
- Integrate and tune embedding models and vector databases (e.g., Milvus) for high-performance document retrieval.
- Develop hybrid retrieval systems (BM25 and vector search) and optimize retrieval performance using key metrics such as mean reciprocal rank (MRR).
- Apply and fine-tune large language models (LLMs) for complex NLP tasks such as named-entity recognition, question answering, and summarization.
- Build and optimize prompt-engineering solutions that ensure structured outputs and reduce model hallucinations.
- Develop, containerize, and deploy agent-based microservices (FastAPI, Azure Functions); define Infrastructure as Code (Terraform/ARM).
- Establish CI/CD workflows (GitHub Actions) for automated testing and deployment, including monitoring and alerting for system performance and SLA compliance.
- Profile and optimize system performance and cost (batching, caching, early stopping) while producing high-quality documentation and architecture diagrams.
- Ensure security and compliance with industry regulations (HIPAA/GxP), including data encryption and PII redaction.
- Collaborate onsite three days a week (hybrid role; must be within driving distance).
- No visa sponsorship available for this position.
