Required Skills :
* Local-First AI Expertise: Proven track record deploying and optimizing open-source LLMs (e.g. LLaMA, Mistral) in non-cloud, restricted, or air-gapped private infrastructures
* Deep Framework Proficiency: Heavy hands-on experience with PyTorch, Hugging Face, and orchestration layers like LangChain, LlamaIndex, or equivalent frameworks
* Vector and Retrieval Mastery: Direct experience engineering production-grade RAG architectures, embeddings, semantic search, and local vector databases (e.g. FAISS, Qdrant, Milvus, Chroma)
* Containerization and Compute Infrastructure: Strong experience containerizing AI workloads via Docker/Kubernetes and managing dedicated GPU-based compute environments
* Advanced ML Concepts: Solid understanding of fine-tuning techniques (LoRA/QLoRA) versus prompt engineering, and of model quantization formats (GGUF, AWQ, EXL2)
* Autonomy: Ability to build, test, and iterate rapidly in an isolated development sandbox with zero dependency on third-party cloud APIs
* Experience operating within heavily regulated or compliance-driven industries (e.g. high-governance data environments, fintech, or legal-tech)
* Familiarity with local-first agentic workflows, Model Context Protocol (MCP), or building fully internal developer copilots and autonomous knowledge systems
Basic Qualification :
Additional Skills :
Background Check : No
Drug Screen : No
Rank : A3
Requested Date :
Jobs Details : 6 Month
Visa Restrictions : No sponsorship
Locals Only/ Out of Area/ Remote : REMOTE
3-5 Must Haves :
* Strong local (non-cloud) AI experience
* Deep experience with AI frameworks, pipelines
* RAG, vector database expertise
* Infrastructure performance op...