Job Description:
We are looking for a Data Scientist with expertise in Python, Azure Cloud, NLP, forecasting, and large-scale data processing. The role involves enhancing existing ML models; optimising embeddings, LDA models, RAG architectures, and forecasting models; and migrating data pipelines to Azure Databricks for scalability and efficiency.
Key Responsibilities:
Model Development & Optimisation
- Train and optimise models for new data providers, ensuring seamless integration.
- Enhance models for dynamic input handling.
- Improve LDA model performance to handle a higher number of clusters efficiently.
- Optimise RAG (Retrieval-Augmented Generation) architecture to enhance recommendation accuracy for large datasets.
- Upgrade Retrieval QA architecture for improved chatbot performance on large datasets.
Forecasting & Time Series Modelling
- Develop and optimise forecasting models for marketing, demand prediction, and trend analysis.
- Implement time series models (e.g. ARIMA, Prophet, LSTMs) to improve business decision-making.
- Integrate NLP-based forecasting, leveraging customer sentiment and external data sources (e.g. news, social media).
Data Pipeline & Cloud Migration
- Migrate the existing pipeline from Azure Synapse to Azure Databricks and retrain models accordingly. Note: this is required only for the AUB role(s).
- Address space and time complexity issues in embedding storage and retrieval on Azure Blob Storage.
- Optimise embedding storage and retrieval in Azure Blob Storage for better efficiency.
MLOps & Deployment
- Implement MLOps best practices for model deployment on Azure ML, Azure Kubernetes Service (AKS), and Azure Functions.
- Automate model training, inference pipelines, and API deployments using Azure services.
Experience:
- Experience in Data Science, Machine Learning, Deep Learning, and Gen AI.
- Design, architect, and execute end-to-end data science pipelines, including data extraction, data preprocessing, feature engineering, model building, tuning, and deployment.
- Experience leading a team and taking responsibility for project delivery.
- Experience building end-to-end machine learning pipelines, with expertise in developing CI/CD pipelines using Azure Synapse pipelines, Databricks, Google Vertex AI, and AWS.
- Experience developing advanced natural language processing (NLP) systems, specialising in building RAG (Retrieval-Augmented Generation) models using LangChain and deploying them to production.
- Expertise in building machine learning pipelines and deploying models such as forecasting, anomaly detection, market mix, classification, and regression models, as well as clustering techniques.
- Maintaining GitHub repositories and cloud computing resources for effective and efficient version control, development, testing, and production.
- Developing proof-of-concept solutions and assisting in rolling these out to our clients.
Required Skills & Qualifications:
- Hands-on experience with Azure Databricks, Azure ML, Azure Synapse, Azure Blob Storage, and Azure Kubernetes Service (AKS).
- Experience with forecasting models, time series analysis, and predictive analytics.
- Proficiency in Python (NumPy, Pandas, TensorFlow, PyTorch, Statsmodels, scikit-learn, Hugging Face, FAISS).
- Experience with model deployment, API optimisation, and serverless architectures.
- Hands-on experience with Docker, Kubernetes, and MLflow for tracking and scaling ML models.
- Expertise in optimising the time complexity, memory efficiency, and scalability of ML models in a cloud environment.
- Experience with LangChain or an equivalent framework, RAG, and multi-agentic generation.
Location:
DGS India Bengaluru Manyata N1 Block
Brand:
Merkle
Time Type:
Full time
Contract Type:
Permanent