Description:
Devsinc is hiring a skilled AI & ML Engineer with more than 2 years of professional experience in building and fine-tuning Generative AI models (LLMs, Diffusion Models), Vision-Language Models (VLMs), and both classical and deep learning systems, developing solutions from scratch and taking them end-to-end into production.
This role combines modeling and MLOps expertise, involving end-to-end ownership from model training and fine-tuning to optimization, deployment, and serving. You'll work on diverse, high-impact projects such as Generative AI applications, Stable Diffusion, OCR, theft detection, and recommendation systems, designing, optimizing, and serving custom models for real-world production use.
Key Responsibilities:
- Develop production inference stacks: Convert and optimize models (Torch, ONNX, TensorRT), quantize/prune, profile FLOPs and latency, and deliver low-latency GPU inference with minimal accuracy loss.
- Build robust model-serving infrastructure: Implement FastAPI/gRPC inference services, token- or frame-level streaming, model versioning and routing, autoscaling, rollbacks, and A/B testing.
- Create Computer Vision solutions from scratch: Design pipelines for object detection, theft detection, OCR (document parsing, structured extraction), and surveillance analytics; fine-tune Hugging Face pretrained models when beneficial.
- Fine-tune Stable Diffusion and other generative models for brand- or style-consistent image generation and downstream vision tasks.
- Train and fine-tune Vision-Language Models (VLMs) for multimodal tasks (captioning, VQA, multimodal retrieval) using both from-scratch and transfer-learning approaches.
- Design and adapt LLM-based Generative AI systems for conversational agents, summarization, RAG pipelines, and domain-specific fine-tuning.
- Implement MLOps / LLMOps / AIOps practices: Automate CI/CD for training and deployment, manage datasets and experiments, maintain model registries, and monitor latency, drift, and performance with alerting and retraining pipelines.
- Develop data acquisition & ingestion pipelines: Build compliant scrapers, collectors, and scalable ingestion systems with proxy rotation and rate-limit handling.
- Integrate third-party models and APIs (Hugging Face, OpenAI, etc.) and design hybrid inference strategies combining local and cloud models for optimal performance.
Requirements:
- Education: Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field.
- Experience: More than 2 years of professional experience in AI/ML or relevant domains, with a proven track record of developing, training, and deploying machine learning or deep learning models in real-world environments.
- Excellent understanding of classical ML (scikit-learn): regression, classification, clustering; able to design experiments and baselines.
- Strong expertise in Computer Vision: object detection, segmentation, OCR pipelines (training from scratch and transfer learning).
- Deep understanding of model optimization: quantization, pruning, distillation, FLOPs analysis, CUDA profiling, mixed precision, and inference performance trade-offs.
- Proven ability to design and train models from scratch (not only using pretrained checkpoints): architecture design, loss functions, training loops, and evaluation.
- Hands-on experience with LLMs and diffusion-based models (e.g. Stable Diffusion).
- Proficiency with ONNX, TensorRT, TorchScript, and serving frameworks (Triton, TorchServe, or ONNX Runtime).
- Skilled in GPU programming and CUDA optimization (profiling with nvprof/Nsight, memory management, multi-GPU setups).
- Strong backend engineering in Python (FastAPI, Flask), async programming, WebSockets/SSE, and RESTful API design.
- Experience with containerization and orchestration (Docker, Kubernetes, Helm) and deploying GPU workloads to AWS/GCP/Azure or on-prem clusters.
- Solid software engineering discipline: CI/CD, testing, code reviews, reproducibility, and version control.
- Nice-to-Have: Familiarity with privacy-preserving ML (differential privacy, federated learning) and observability tools like Prometheus, Grafana, Sentry, or OpenTelemetry.
- Collaborative: open to knowledge-sharing and teamwork.
- Team Player: willing to support peers and contribute to collective success.
- Growth-Minded: eager to learn, improve, and adapt to emerging technologies.
- Adaptable: flexible in dynamic, fast-paced environments.
- Customer-Centric: focused on delivering solutions that create real business value.