We're hiring a Data Scientist focused on Large Language Models to join our AI R&D team. You'll work on designing, training, and optimizing LLMs that power real product features and internal tools. You'll touch everything from architecture and data preparation to multi-scale training and evaluation.
This is not a prompt-engineering or RAG-focused role. We're looking for someone who gets the math, understands the models, and knows how to write code that scales.
You'll work side by side with engineers, MLOps, and product teams to make our LLMs smarter, faster, and more useful.
What You'll Do:
Train and fine-tune LLMs using multi-GPU and distributed setups.
Analyze model performance, debug failures, and implement improvements.
Work with curated and synthetic datasets, including classification and generation tasks.
Design experiments, track results, and iterate quickly with tools like MLflow.
Write clean, production-ready code and collaborate via GitHub.
Push boundaries: think about architecture, memory efficiency, scale, and cost.
What We're Looking For:
LLM understanding: You know how these models work under the hood - transformer internals, tokenization, embeddings, etc.
Stats and ML fundamentals: You have a solid foundation in statistics, machine learning, and optimization.
Coding skills: You write Python well. You've worked with PyTorch or JAX. You don't fear Bash or git.
Training experience: You've trained models beyond notebooks. Bonus if you've worked with mixed precision, DeepSpeed, or multi-node training.
Systems mindset: You understand trade-offs - throughput vs. memory, latency vs. accuracy.
Good judgment: You know when to read a paper, when to read a stack trace, and when to rewrite the dataloader or part of the LLM architecture.
Bonus Points:
Experience with Hugging Face Transformers, Datasets, and Accelerate.
Familiarity with Kubernetes, Ray, or custom training infrastructure.
Exposure to embeddings, classification tasks, or token-level losses.
Ability to mentor or guide junior researchers or engineers.
How We Hire:
Online assessment: technical logic and fundamentals (Math/Calculus, Statistics, Probability, Machine Learning/Deep Learning, Code).
Technical interview: a deep dive into theory and reasoning (no code).
Cultural interview.
If you are not willing to take an online assessment, please do not apply.
Diversity and inclusion:
We believe in social inclusion, respect, and appreciation of all people. We promote a welcoming work environment where each CloudWalker can be authentic, regardless of gender, ethnicity, race, religion, sexuality, mobility, disability, or education.