Software Engineer, AI/ML

Lahore - Pakistan

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

Description

Devsinc is hiring a skilled AI/ML Engineer with at least 3 years of hands-on experience in building and fine-tuning Generative AI models (LLMs, diffusion models), Vision-Language Models (VLMs), and both classical and deep learning systems, developing solutions from scratch and taking them end-to-end into production.

This role combines modeling and MLOps expertise, with end-to-end ownership from model training and fine-tuning to optimization, deployment, and serving. You'll work on diverse, high-impact projects such as Generative AI applications, Stable Diffusion, OCR, theft detection, and recommendation systems, designing, optimizing, and serving custom models for real-world production use.

Key Responsibilities:

Develop production inference stacks: Convert and optimize models (Torch, ONNX, TensorRT), quantize/prune, profile FLOPs and latency, and deliver low-latency GPU inference with minimal accuracy loss.
Build robust model-serving infrastructure: Implement FastAPI/gRPC inference services, token- or frame-level streaming, model versioning and routing, autoscaling, rollbacks, and A/B testing.
Create Computer Vision solutions from scratch: Design pipelines for object detection, theft detection, OCR (document parsing, structured extraction), and surveillance analytics; fine-tune Hugging Face pretrained models when beneficial.
Fine-tune Stable Diffusion and other generative models for brand- or style-consistent image generation and downstream vision tasks.
Train and fine-tune Vision-Language Models (VLMs) for multimodal tasks (captioning, VQA, multimodal retrieval) using both from-scratch and transfer-learning approaches.
Design and adapt LLM-based Generative AI systems for conversational agents, summarization, RAG pipelines, and domain-specific fine-tuning.
Implement MLOps / LLMOps / AIOps practices: Automate CI/CD for training and deployment, manage datasets and experiments, maintain model registries, and monitor latency, drift, and performance with alerting and retraining pipelines.
Develop data acquisition & ingestion pipelines: Build compliant scrapers, collectors, and scalable ingestion systems with proxy rotation and rate-limit handling.
Integrate third-party models and APIs (Hugging Face, OpenAI, etc.) and design hybrid inference strategies combining local and cloud models for optimal performance.

Requirements

Education: Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field.
Experience: Minimum 3 years of professional experience in AI/ML or related domains.
Strong expertise in Computer Vision: object detection, segmentation, OCR pipelines (training from scratch and transfer learning).
Deep understanding of model optimization: quantization, pruning, distillation, FLOPs analysis, CUDA profiling, mixed precision, and inference performance trade-offs.
Proven ability to design and train models from scratch, including architecture design, loss functions, training loops, and evaluation.
Hands-on experience with LLMs and diffusion-based models (e.g., Stable Diffusion).
Proficiency with ONNX, TensorRT, TorchScript, and serving frameworks (Triton, TorchServe, or ONNX Runtime).
Skilled in GPU programming and CUDA optimization (profiling with nvprof/Nsight, memory management, multi-GPU setups).
Strong backend engineering in Python (FastAPI, Flask), async programming, WebSockets/SSE, and RESTful API design.
Experience with containerization and orchestration (Docker, Kubernetes, Helm) and deploying GPU workloads to AWS/GCP/Azure or on-prem clusters.
Understanding of classical ML techniques (regression, classification, clustering) and experiment design.
Solid software engineering discipline: CI/CD, testing, code reviews, reproducibility, and version control.
Nice-to-Have: Familiarity with privacy-preserving ML (differential privacy, federated learning) and observability tools like Prometheus, Grafana, Sentry, or OpenTelemetry.
Collaborative: open to knowledge-sharing and teamwork.
Team Player: willing to support peers and contribute to collective success.
Growth-Minded: eager to learn, improve, and adapt to emerging technologies.
Adaptable: flexible in dynamic, fast-paced environments.
Customer-Centric: focused on delivering solutions that create real business value.

Key Skills

Spring
.NET
C/C++
Go
React
OOP
C#
Data Structures
JavaScript
Software Development
Java
Distributed Systems

About Company

Devsinc

Devsinc helps startups, enterprises, and public sector clients accelerate their technology life cycle by unlocking access to 2,000+ passionate and experienced solution providers with experience in 100+ technologies in their time zone.
