Sr Data Scientist - Innovation Lab

Genzeon Global

Job Location: Hyderabad, India
Monthly Salary: Not Disclosed
Experience Required: 6 years
Posted on: 30+ days ago
Vacancies: 1

Job Summary

Sr Data Scientist


Job Responsibilities:


  • LLM Architecture: Good understanding of the architecture underlying large language models, such as Transformer-based models and their variants. Design and implement deep learning model architectures using PyTorch.
  • Language Model Training and Fine-Tuning: Experience in training large-scale language models from scratch as well as fine-tuning pre-trained models on domain data.
  • Data Preprocessing for NLP: Skilled in preprocessing textual data, including tokenization, stemming, lemmatization, and handling of different text encodings.
  • Transfer Learning and Adaptation: Proficiency in applying transfer learning techniques to adapt existing LLMs to new languages, domains, or specific business needs.
  • Data Annotation and Evaluation: Skills in designing and implementing data annotation strategies for training LLMs and evaluating their performance using appropriate metrics.
  • Scalability and Deployment: Experience in scaling LLMs for production environments, ensuring efficiency and robustness in deployment.
  • Model Training, Optimization, and Evaluation: Evaluate the performance of PyTorch models using appropriate metrics and techniques such as cross-validation, holdout sets, or online evaluation. This encompasses the complete cycle of training, fine-tuning, and validating language models. You will design and adapt LLMs for use in virtual assistants, information retrieval and extraction, etc.
  • Experimentation with Emerging Technologies and Methods: Actively exploring new technologies and methodologies in language model development, including experimental frameworks and software tools.
  • LLM Alignment: Understanding of algorithms such as DPO, PPO, KPO, and RLHF, and using them for guardrails.
  • AI Data Retrieval: Retrieve data from unstructured sources and extract key-value pairs using techniques such as Donut, LayoutLM, and Table Transformer.
  • Analyze data and build EDAs to identify data patterns. Hands-on experience with and strong understanding of concepts in Deep Learning and NLP. Proficient in TensorFlow and similar libraries.


Required Qualifications

  • 5 years of hands-on experience in developing and deploying Large Language Models and machine learning models, and in working with PyTorch.
  • A thorough understanding of machine learning, particularly deep learning techniques, including knowledge of neural network architectures, training methods, and optimization algorithms.
  • Proficiency in Python for AI development, including experience with NLP libraries (e.g. Hugging Face Transformers, NLTK, spaCy) and text classification.
  • Experience with frameworks: PyTorch or TensorFlow.
  • Experience with cloud services (AWS, Azure) and ML deployment tools such as Docker.
  • Familiarity with model fine-tuning and optimization techniques for LLMs.
  • Proven track record of innovative solutions in the field of LLMs.
  • Strong communication skills with the ability to explain complex AI concepts to non-expert audiences.

Additional good-to-have qualifications:


  • 4 years of experience in data analytics, data science, or quantitative analysis, using statistical computer languages to draw insights from large data sets. 3 years of experience in Python development, preferably delivering production code for data applications.
  • Experience with unstructured data or computer vision models is a plus.
  • Experience with SQL is a big plus. Extensive model implementation experience using scikit-learn.
  • Experience designing and developing for security-critical applications; experience with the specifics of HIPAA/PHI/PII/GDPR is a big plus.
  • Basic experience with Linux, Git, and Jupyter Notebooks is a must. Knowledge of Agile development practices. Flexibility and adaptability to respond to a rapidly changing environment.
  • Experience with distributed computing techniques and job orchestration tools and platforms (Airflow, etc.) is very valuable.





DataScience LLM GenerativeAI NLP PyTorch TensorFlow MachineLearning DeepLearning ArtificialIntelligence HuggingFace RLHF AIAlignment CloudAI

Education

M.Tech

Key Skills

  • Catering
  • Apache Commons
  • Architectural Design
  • Human Resources Administration
  • Accident Investigation