Sr Data Scientist - Innovation Lab

Employer Active

1 Vacancy
Experience

6 years

Job Location

Hyderabad - India

Monthly Salary

Not Disclosed

Job Description

Sr Data Scientist


Job Responsibilities:


  • LLM Architecture: Good understanding of the architecture underlying large language models, such as Transformer-based models and their variants. Design and implement deep learning model architectures using PyTorch.
  • Language Model Training and Fine-Tuning: Experience in training large-scale language models from scratch, as well as fine-tuning pre-trained models on domain data.
  • Data Preprocessing for NLP: Skilled in preprocessing textual data, including tokenization, stemming, lemmatization, and handling of different text encodings.
  • Transfer Learning and Adaptation: Proficiency in applying transfer learning techniques to adapt existing LLMs to new languages, domains, or specific business needs.
  • Data Annotation and Evaluation: Skills in designing and implementing data annotation strategies for training LLMs and evaluating their performance using appropriate metrics.
  • Scalability and Deployment: Experience in scaling LLMs for production environments, ensuring efficiency and robustness in deployment.
  • Model Training, Optimization, and Evaluation: Evaluate the performance of PyTorch models using appropriate metrics and techniques such as cross-validation, holdout sets, or online evaluation. This encompasses the complete cycle of training, fine-tuning, and validating language models. You will design and adapt LLMs for use in virtual assistants, information retrieval and extraction, etc.
  • Experimentation with Emerging Technologies and Methods: Actively exploring new technologies and methodologies in language model development, including experimental frameworks and software tools.
  • LLM Alignment: Understanding of algorithms such as DPO, PPO, KPO, and RLHF, and using them for guardrails.
  • AI Data Retrieval: Retrieve data from unstructured sources and extract key-value pairs using techniques such as Donut, LayoutLM, and Table Transformer.
  • Analyze data and perform exploratory data analysis (EDA) to identify data patterns.
  • Hands-on experience with, and a strong understanding of, concepts in deep learning and NLP.
  • Proficient in TensorFlow and similar libraries.


Required Qualifications

  • 5 years of hands-on experience in developing and deploying large language models and machine learning models, and in working with PyTorch.
  • A thorough understanding of machine learning, particularly deep learning techniques, including knowledge of neural network architectures, training methods, and optimization algorithms.
  • Proficiency in AI technologies and Python, including experience with NLP libraries (e.g., Hugging Face Transformers, NLTK, spaCy) and text classification.
  • Experience with frameworks: PyTorch or TensorFlow.
  • Experience with cloud services (AWS, Azure) and ML deployment tools such as Docker.
  • Familiarity with model fine-tuning and optimization techniques for LLMs.
  • Proven track record of innovative solutions in the field of LLMs.
  • Strong communication skills with the ability to explain complex AI concepts to non-expert audiences.

Additional good-to-have qualifications:


  • 4 years of experience in data analytics, data science, and quantitative analysis, using statistical computer languages to draw insights from large data sets.
  • 3 years of experience in Python development, preferably delivering production code for data applications.
  • Experience with unstructured data or computer vision models is a plus.
  • Experience with SQL is a big plus.
  • Extensive model implementation experience using scikit-learn.
  • Experience designing and developing for security-critical applications; experience with the specifics of HIPAA/PHI/PII/GDPR is a big plus.
  • Basic experience with Linux, Git, and Jupyter Notebooks is a must.
  • Knowledge of Agile development practices.
  • Flexibility and adaptability to respond to a rapidly changing environment.
  • Experience with distributed computing techniques and job orchestration tools and platforms, such as Airflow, is very valuable.





DataScience LLM GenerativeAI NLP PyTorch TensorFlow MachineLearning DeepLearning ArtificialIntelligence HuggingFace RLHF AIAlignment CloudAI

Education

M.Tech

Employment Type

Full Time
