Vision-Language Model (VLM) Engineer
Job Summary
We are seeking a highly skilled Vision-Language Model (VLM) Engineer to design, develop, and deploy state-of-the-art multimodal AI systems. You will work at the intersection of computer vision and natural language processing, contributing to cutting-edge products that combine image and text understanding.
Key Responsibilities:
Design and implement vision-language models for tasks such as image captioning, visual question answering, and cross-modal retrieval
Train, fine-tune, and evaluate multimodal models using large-scale datasets
Optimize model performance for scalability and real-world deployment
Collaborate with cross-functional teams including data scientists, software engineers, and product managers
Stay up to date with the latest research in multimodal AI and apply it to production systems
Required Qualifications:
Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field
Strong experience with Python and deep learning frameworks (e.g., PyTorch or TensorFlow)
Solid understanding of machine learning, computer vision, and NLP concepts
Experience with multimodal models or related architectures (e.g., transformers)
Familiarity with handling large datasets and distributed training
Preferred Qualifications:
Experience with models such as CLIP, BLIP, or similar multimodal architectures
Knowledge of model deployment (Docker, APIs, cloud services)
Publications or contributions to AI research projects
Experience working with real-world AI applications
Key Skills
- Python
- C/C++
- Fortran
- R
- Data Mining
- MATLAB
- Data Modeling
- Laboratory Techniques
- MongoDB
- SAS
- Systems Analysis
About Company
Wide and Wise is a top recruitment agency with offices in Istanbul, Milan, and Dubai, connecting exceptional talent with leading companies across EMEA, MENA, and the US.