PhD Fellowship LLM Architecture Optimization (f/m/d)

Job Location: Heidelberg - Germany

Monthly Salary: Not Disclosed

Vacancy: 1

Job Description

Aleph Alpha Research's mission is to deliver category-defining AI innovation that enables open, accessible and trustworthy deployment of GenAI in industrial applications. Our organization develops foundational models and next-generation methods that make it easy and affordable for Aleph Alpha's customers to increase productivity in development, engineering, logistics and manufacturing processes.

We are looking to grow our academic partnership Lab1141 with TU Darmstadt and our GenAI group of PhD students supervised by Prof. Dr. Kersting. We are looking for an enthusiastic researcher at heart who is passionate about improving foundational multimodal NLP models and aims to obtain a PhD degree in a three-year program. On average, you will spend half of your time at Aleph Alpha Research in Heidelberg and the other half at the Technical University of Darmstadt, which is within easy traveling distance.

As a PhD fellow at Aleph Alpha Research, you will develop new approaches to improve the foundational model architecture and its applications. You are given a unique research environment with a sufficient amount of compute and both industrial and academic supervisors to conduct and publish your research.

Please formulate your dream research topic in your application letter, aligned with the work of the Foundation Models team.

While at Aleph Alpha Research, you will work on the LLM Architecture topic with our Foundation Models team, in which you will create powerful state-of-the-art multimodal foundational models, research and share novel approaches to pretraining, finetuning and helpfulness, and enable cost-efficient inference on a variety of accelerators.

Topic:

Introduction

Foundation models are central to many of the most innovative applications in deep learning and predominantly utilize self-supervised learning, autoregressive generation and the transformer architecture. However, this learning paradigm and architecture come with several challenges. To address these limitations and improve both accuracy and efficiency in generation and downstream tasks, it is essential to consider adjustments to their core paradigms. These include the sourcing and composition of training data, design choices of the training itself, and the underlying model architecture. Furthermore, extensions of the system, such as Retrieval-Augmented Generation (RAG), and changes to foundational components, like tokenizers, should be considered.
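For concreteness, the autoregressive generation mentioned above amounts to predicting one token at a time conditioned on everything generated so far. The minimal greedy-decoding sketch below illustrates this; the `model` object and its call signature are hypothetical placeholders, not a specific Aleph Alpha interface.

```python
# Minimal sketch of an autoregressive (greedy) decoding loop.
# `model` is a hypothetical callable returning logits of shape (batch, seq_len, vocab_size).
import torch

@torch.no_grad()
def greedy_decode(model, input_ids: torch.Tensor, max_new_tokens: int = 32) -> torch.Tensor:
    """Generate tokens one at a time, each step conditioning on all previous tokens."""
    for _ in range(max_new_tokens):
        logits = model(input_ids)                                     # (batch, seq_len, vocab_size)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)    # most likely next token
        input_ids = torch.cat([input_ids, next_token], dim=-1)        # append and repeat
    return input_ids
```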

Related Work

The training data of LLMs is at the core of a model's downstream capabilities. Consequently, recent works focus on extracting high-quality data from large corpora (Llama 3, OLMo 1.7). Additionally, the order and structure in which the data is presented to the model have a large influence on model performance, as demonstrated by curriculum learning approaches (OLMo 1.7, Ormazabal et al., Mukherjee et al.) and more sophisticated data packing algorithms (Staniszewski et al., Shi et al.).
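To illustrate what data packing means in practice, the sketch below greedily packs tokenized documents into fixed-length training sequences. The cited works use more sophisticated, similarity-aware strategies; this only shows the baseline idea, and all names and parameters are chosen for illustration.

```python
# Illustrative greedy first-fit packer: concatenate tokenized documents into
# fixed-length training sequences, separating documents with an EOS token.
from typing import List

def pack_documents(docs: List[List[int]], seq_len: int, eos_id: int) -> List[List[int]]:
    bins: List[List[int]] = []
    for doc in docs:
        doc = doc + [eos_id]                      # mark the document boundary
        placed = False
        for b in bins:                            # first bin with enough free space wins
            if len(b) + len(doc) <= seq_len:
                b.extend(doc)
                placed = True
                break
        if not placed:
            bins.append(doc[:seq_len])            # start a new bin; truncate overlong documents
    return bins
```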

Similarly, adjustments to the training procedure itself have shown promising results. For example, Ibrahim et al. discuss infinite learning rate schedules that allow for more flexibility in adjusting training steps and facilitate continual pretraining tasks more easily.
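One plausible shape of such an "infinite" schedule is sketched below: a linear warmup, a decay to a constant plateau that can be held for as long as pretraining continues, with a short annealing phase applied only when training actually stops (not modeled here). The phases and hyperparameters are illustrative assumptions, not the exact schedule from Ibrahim et al.

```python
# Sketch of an "infinite" learning-rate schedule: warmup, decay, then a constant
# plateau that never forces training to end. Values are illustrative only.
def infinite_lr(step: int, peak: float = 3e-4, floor: float = 3e-5,
                warmup: int = 1000, decay: int = 10000) -> float:
    if step < warmup:                              # linear warmup to the peak LR
        return peak * step / warmup
    if step < warmup + decay:                      # linear decay from peak down to the plateau
        frac = (step - warmup) / decay
        return floor + (peak - floor) * (1.0 - frac)
    return floor                                   # constant plateau: training can continue indefinitely
```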

Further, the LLM architecture and its components leave room for improvement. Ainslie et al. introduce grouped-query attention (GQA), which increases the efficiency of the transformer's attention component. Liu et al. make changes to the rotary position embeddings to improve long-context understanding.
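To make the GQA idea concrete, here is a compact sketch in which several query heads share one key/value head, shrinking the KV cache. The tensor shapes and the use of repeat_interleave are illustrative assumptions, not the implementation from Ainslie et al.

```python
# Sketch of grouped-query attention: fewer KV heads than query heads,
# with each KV head broadcast to its group of query heads.
import torch
import torch.nn.functional as F

def grouped_query_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group_size = n_q_heads // n_kv_heads           # query heads per shared KV head
    k = k.repeat_interleave(group_size, dim=1)     # broadcast each KV head to its query group
    v = v.repeat_interleave(group_size, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v
```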

Recently, structured state-space sequence models (SSMs) (Gu et al., Poli et al.) and hybrid architectures have emerged as a promising class of architectures for sequence modeling.
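At their core, SSMs compute a linear recurrence over a hidden state. The naive scan below shows only that recurrence; the cited models rely on careful parameterization and much faster convolutional or parallel-scan implementations, so this is purely an illustrative sketch.

```python
# Naive sequential scan of a linear state-space model:
#   x_t = A x_{t-1} + B u_t,   y_t = C x_t
import numpy as np

def ssm_scan(A: np.ndarray, B: np.ndarray, C: np.ndarray, u: np.ndarray) -> np.ndarray:
    x = np.zeros(A.shape[0])            # hidden state of dimension state_dim
    outputs = []
    for u_t in u:                       # sequential scan over the scalar input sequence
        x = A @ x + B * u_t             # update the hidden state
        outputs.append(C @ x)           # read out the scalar output
    return np.array(outputs)
```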

Lastly, the model itself can be embedded in a larger system such as RAG. For example, in-context learning via RAG enhances the generation's accuracy and credibility (Gao et al.), particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and integration of domain-specific information.
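The RAG loop described here reduces to three steps: retrieve relevant passages, assemble them into the prompt as in-context evidence, and generate. The sketch below uses hypothetical `retriever` and `llm` interfaces rather than any specific library's API.

```python
# High-level sketch of a retrieval-augmented generation pipeline.
# `retriever` and `llm` are hypothetical callables, not a real library's API.
from typing import Callable, List

def rag_answer(question: str,
               retriever: Callable[[str, int], List[str]],
               llm: Callable[[str], str],
               top_k: int = 3) -> str:
    passages = retriever(question, top_k)     # fetch the most relevant documents
    context = "\n\n".join(passages)           # place them in the prompt as in-context evidence
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)                        # the model grounds its answer in the retrieved text
```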

Goals

This project aims to explore novel LLM-system architectures, data and training paradigms that could either replace or augment traditional autoregressive generation and transformer components, as well as enhance auxiliary elements such as retrievers and tokenizers.

Your responsibilities:

Your profile:

Our tenets

We believe embodying these values would make you a great fit for our team:

What you can expect from us

Employment Type

Full-Time

About Company
