Job Description:
Key Responsibilities:
- Build robust AI/ML platforms on GCP, ensuring scalability, reliability, and performance.
- Set up and maintain GCP services such as AI Platform, BigQuery, Cloud Storage, Compute Engine, and Kubernetes Engine.
- Develop automated workflows and pipelines for model training, validation, deployment, and monitoring.
- Work closely with data scientists, ML engineers, and other stakeholders to understand their needs and provide optimal solutions.
- Continuously optimize the AI/ML infrastructure for cost, performance, and security.
- Implement monitoring solutions to ensure the health and performance of AI/ML systems and troubleshoot any issues that arise.
- Maintain comprehensive documentation of the architecture, workflows, and best practices.
Preferred Qualifications:
- Proven experience as an AI/ML engineer with a focus on platform engineering and GCP.
- Experience with machine learning frameworks and libraries such as TensorFlow, PyTorch, or scikit-learn.
- Experience with Terraform and Helm is an advantage.
- Experience implementing inference, benchmarking, and fine-tuning of generative AI models using GCP services.
- Strong understanding of LLMs and proven experience building platforms and applications that leverage them.
- Experience with CI/CD pipelines, containerization (Docker), and orchestration (Kubernetes).
- Strong understanding of data storage, processing, and ETL workflows.
- Knowledge of chunking strategies for vector databases.
- Knowledge of classification and embedding models.
- Familiarity with Agile methodologies and project management tools.
Desired Qualifications:
- Excellent problem-solving skills and the ability to work in a fast-paced environment.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- 2 years of experience designing, developing, testing, and optimizing Python microservices using gRPC.
- 2 years of Python development experience.
- 3 years of Linux experience.
Skills:
Mandatory Skills: GCP AI Services, GCP Gemini, GCP Vertex AI