Must-Have Technical/Functional Skills
Advanced Python development for ML/AI workloads
End-to-end ML lifecycle: model training, evaluation, fine-tuning, and labeling/tagging workflows
Generative AI systems design including LLM-based application development
Prompt engineering and optimization for large language models
Document AI pipelines: OCR/extraction, parsing, normalization, and text chunking for structured and unstructured data
Embedding generation pipelines for semantic search and retrieval
Vector similarity search implementation using vector databases
ML model integration with Vector DBs and MongoDB
Production-grade ML engineering: scalable, maintainable, and deployment-ready code
Python, Large Language Models (LLMs) (via LLM-based applications), Vector Databases, MongoDB
Roles & Responsibilities
We are seeking a highly skilled Data Science Engineer to design and develop scalable ML and Generative AI solutions. The ideal candidate will have deep expertise in Python, hands-on experience in model training and document processing pipelines, and strong knowledge of vector databases and modern ML/GenAI frameworks.
Strong fit if the candidate:
Has expert-level Python skills
Has hands-on experience building ML/GenAI systems, not just theoretical knowledge
Has worked on end-to-end ML pipelines (data, model, deployment)
Has experience with document AI, embeddings, and vector search
Thinks like an engineer (scalable, maintainable, production-ready code)
Likely not a fit if the candidate is:
Primarily a BI / reporting analyst
Focused only on statistical modeling or academic research
Lacking experience with deployment pipelines or GenAI systems
Key Responsibilities
Develop and deploy machine learning and GenAI solutions using Python
Design and optimize prompt engineering strategies for LLM-based applications
Build document extraction, parsing, and chunking pipelines for structured and unstructured data
Train, evaluate, and fine-tune ML models; manage tagging and labeling workflows
Implement embedding generation and vector search solutions
Integrate ML models with Vector DBs and MongoDB
Ensure code quality, scalability, and production readiness