PhD Project Proposal – CIFRE Collaboration with PAL Robotics

PAL Robotics

Job Location:

Toulouse - France

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1 Vacancy

Job Summary

The central and concrete objective of this PhD thesis is to develop and deploy cutting-edge locomotion controllers on the bipedal robot Kangaroo [Kangaroo22], developed by PAL Robotics, using reinforcement learning techniques [Hwangbo19]. The Kangaroo platform presents a particularly high degree of mechanical complexity and nonlinearity, which makes it extremely challenging to model accurately in simulation [KangarooRL25]. As a result, standard sim-to-real approaches are difficult to apply, with the Kangaroo robot exhibiting a larger-than-usual gap between simulation and physical reality. The thesis aims to study and characterize this gap and to propose novel learning-based control strategies capable of overcoming or mitigating it, either by improving transferability or by learning directly on the real system.

This work will rely on strong experimental foundations and a tight collaboration between Gepetto at LAAS-CNRS and PAL Robotics. Gepetto brings extensive expertise in locomotion control and has successfully deployed reinforcement learning-based policies on multiple quadruped and biped robots [SoloParkour2024]. PAL Robotics, on the other hand, has designed and developed the Kangaroo platform and has already demonstrated a partial sim-to-real deployment of locomotion policies on real hardware.

The project will leverage the forthcoming Kangaroo prototype, currently being assembled at PAL Robotics and expected to be delivered in Spring 2026. In addition, the thesis will benefit from the experimental facilities at LAAS-CNRS, including a large motion capture room equipped with a safety crane, a complete fab lab and mechatronics workshop, and several additional humanoid platforms such as Unitree H1 and R1. This environment will provide a unique opportunity to carry out extensive experimental validation and benchmarking of the developed learning-based controllers in both simulated and real-world conditions.

The PhD position is complemented by a 10-month engineering contract designed to provide technical and experimental support. This preliminary position is primarily intended to precede the start of the PhD, allowing the selected candidate to become familiar with the underlying technologies and to contribute to the preparation of the Kangaroo platform at LAAS-CNRS before the scientific work begins. In this configuration, the recruitment process would cover both stages: an engineering position at LAAS-CNRS (for example, from January to October 2026) followed by the PhD contract under PAL France (from November 2026 to October 2029). Alternatively, the engineering support contract could be allocated in parallel with the PhD to reinforce the experimental aspects of the project and assist in the maintenance, testing, and data collection on the Kangaroo robot throughout the thesis. This flexibility ensures that the candidate and the research team can make the most effective use of the available resources, depending on the project's development timeline.

The proposed research follows a progressive methodology combining simulation-based pretraining, real-world learning, and iterative adaptation. The first phase will consist of training locomotion policies on an existing bipedal simulation model using modern physics engines such as MuJoCo or Isaac Gym. The simulation parameters will be carefully identified from real robot data [IdRL2025], following methodologies successfully applied to the Bipetto robot at LAAS-CNRS. The specific actuation transmission of Kangaroo will be explicitly modeled to capture its unique mechanical characteristics [Kangaroo24, KangarooRL25]. The main objective of this phase is to obtain a baseline policy capable of generating simple yet stable locomotion behaviors that can be safely transferred to the real hardware.
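As a rough illustration of what identifying simulation parameters from real robot data can look like, the sketch below fits two parameters of a single joint by least squares. The model tau = inertia * acceleration + damping * velocity, the function name, and the logged signals are all assumptions made for this example. It is not the identification method of [IdRL2025], and it ignores the specific transmission of Kangaroo.

```python
def fit_inertia_and_damping(torques, velocities, accelerations):
    """Least-squares fit of the assumed model tau = inertia * acc + damping * vel.

    All three inputs are equal-length lists of samples logged on one joint.
    Returns (inertia, damping). The model and names are illustrative only.
    """
    # Entries of the 2x2 normal equations A^T A x = A^T tau, with A = [acc, vel].
    saa = sum(a * a for a in accelerations)
    svv = sum(v * v for v in velocities)
    sav = sum(a * v for a, v in zip(accelerations, velocities))
    sat = sum(a * t for a, t in zip(accelerations, torques))
    svt = sum(v * t for v, t in zip(velocities, torques))

    det = saa * svv - sav * sav
    if abs(det) < 1e-12:
        # Acceleration and velocity samples are (nearly) proportional,
        # so the two parameters cannot be separated from this data.
        raise ValueError("data does not excite both parameters")

    # Closed-form solution of the 2x2 system.
    inertia = (sat * svv - svt * sav) / det
    damping = (saa * svt - sav * sat) / det
    return inertia, damping
```

The fitted values would then be written into the simulator's joint model before policy training. A real identification pipeline would typically cover many more parameters and use filtered, sufficiently exciting trajectories.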

Building on the initial deployment, the research will then focus on residual learning approaches, where small task-specific neural networks are trained directly on the real robot to refine the pretrained policy [ResidualRL19, RLPT24]. These residual models will adapt the baseline controller to compensate for unmodeled dynamics, sensor drift, and contact uncertainties. The workflow will follow an iterative cycle alternating between simulation and hardware: simulation phases will enable massive data collection and large-scale policy optimization, while short and safe experimental sessions on the robot will provide sparse but high-value data to progressively refine the model and improve transferability. If relevant or necessary, we may also consider iterating on a pretrained world model initialized in simulation and then fine-tuned on the real robot while running the latest iteration of the policy [FineTune25WM25].
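The residual idea described above can be sketched as a frozen baseline policy whose action is corrected by a small, bounded term. Everything in the snippet is an assumption made for illustration: the function names, the representation of policies as plain callables returning lists of joint commands, and the clipping bound `max_correction`. It is not taken from [ResidualRL19] or [RLPT24].

```python
def clip(value, low, high):
    """Clamp a scalar to the interval [low, high]."""
    return max(low, min(high, value))


def residual_action(base_policy, residual_policy, observation, max_correction=0.1):
    """Combine a frozen baseline action with a bounded learned correction.

    base_policy and residual_policy are callables mapping an observation to a
    list of per-joint commands (hypothetical interface). Only the residual
    would be trained on the real robot; the baseline stays fixed.
    """
    base = base_policy(observation)
    delta = residual_policy(observation)
    # Bounding the correction keeps the combined command close to the
    # baseline behavior, which limits risk during early hardware sessions.
    return [b + clip(d, -max_correction, max_correction) for b, d in zip(base, delta)]
```

Bounding the correction is one possible design choice among several: it trades some adaptation capacity for a guarantee that the command never departs from the baseline by more than `max_correction` per joint.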

Alternatively, the project will explore direct training on the robot using off-policy reinforcement learning algorithms [20min22, 8min25]. These methods target real-world learning without relying on simulation or explicit modeling, and have recently shown promising results on quadruped platforms. The key scientific challenge will be to adapt these approaches to bipedal locomotion, where data collection is inherently riskier and safety constraints are more stringent. This part of the study will provide insights into how to efficiently and safely gather real-world data for humanoid control, contributing to the broader understanding of reinforcement learning on complex robotic systems.
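Off-policy algorithms are attractive for learning on hardware largely because every recorded transition can be reused for many updates. A minimal sketch of the underlying data structure, a fixed-capacity replay buffer, is given below. The class name, the transition layout, and the seeding are assumptions for this example and do not reproduce the implementations of [20min22] or [8min25].

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen silently drops the oldest transition when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, obs, action, reward, next_obs, done):
        """Record one transition collected on the robot."""
        self._storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        """Return up to batch_size distinct transitions for one update."""
        count = min(batch_size, len(self._storage))
        return self._rng.sample(list(self._storage), count)

    def __len__(self):
        return len(self._storage)
```

In a training loop, a short session on the robot would call `add` at every control step, and the learner would call `sample` repeatedly between sessions, so that scarce real-world data is used many times.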

Overall, the project aims to deliver both theoretical and practical advances: a framework for reinforcement learning that goes beyond the traditional sim-to-real paradigm, validated on an industrial-grade bipedal platform. The work will produce fundamental methodological contributions suitable for publication in top-tier robotics and machine learning venues, supported by open-source software developments. On the industrial side, the project will demonstrate the capabilities of Kangaroo through highly visible experimental achievements, potentially extending beyond standard bipedal walking toward dynamic whole-body movements such as jumps or parkour-like motions. These demonstrations will enhance both scientific impact and public visibility, strengthening the collaboration between PAL Robotics and the academic partners.


Qualifications:

  • MSc degree in Computer Science, Robotics, Artificial Intelligence, or a closely related field
  • Excellent programming skills in C and solid proficiency in Python
  • Proven experience with ROS (Robot Operating System)
  • Strong understanding of locomotion control for legged or humanoid robots
  • Experience with simulation tools such as MuJoCo, Isaac Gym, PyBullet, or Gazebo
  • Knowledge of reinforcement learning methods for robot control
  • Experience in software development under Linux-based operating systems
  • Strong team-working skills and a proactive attitude

Remote Work:

No


Employment Type:

Full-time

About Company

PAL France is a robotics company based in Toulouse, France, dedicated to advanced Research and Development (R&D) in service robotics. As the official distribution partner of PAL Robotics, Europe’s leading service robotics provider, PAL France focuses on the development of cutting-edge ...