Master's Thesis: Monocular 3D Object Detection

Vienna - Austria

Monthly Salary: Not Disclosed

Posted on: 09-11-2025

Vacancies: 1 Vacancy

Job Summary

Introduction: Conventional single-shot object detection neural networks such as YOLO have achieved remarkable success in identifying and localizing objects within 2D images using axis-aligned rectangular bounding boxes. While effective for many applications, these 2D representations lack crucial information about the object's true 3D pose, dimensions, and orientation in the real world. This limitation becomes significant in applications requiring a deeper understanding of the scene, such as autonomous driving, robotics, and augmented reality.

The goal of this thesis is to extend the capabilities of single-shot object detection networks by developing, training, and evaluating a model that directly predicts oriented 3D bounding boxes from a single monocular image. This includes estimating the object's 3D location, its dimensions (length, width, height), and its 3D orientation.

Motivation: Accurate and efficient monocular 3D object detection is a crucial task in various computer vision applications. Relying on a single camera offers advantages in terms of cost, simplicity, and ease of deployment compared to multi-camera or LiDAR-based systems. This thesis aims to contribute to the advancement of monocular 3D object detection by exploring and implementing a single-shot approach capable of predicting oriented 3D bounding boxes.

Tasks:

Literature Review on Monocular 3D Object Detection CNNs:

Conduct a comprehensive review of existing research in monocular 3D object detection using Convolutional Neural Networks (CNNs).
Address the issue of camera calibration: thoroughly examine how different methods handle camera calibration parameters (intrinsic and extrinsic) and how these affect the accuracy of 3D object detection.
Analyze the strengths and weaknesses of different network architectures.

Investigation of System Constraints:

Identify and analyze the inherent challenges and constraints of monocular 3D object detection compared to methods utilizing depth information.
Consider factors such as:
- Scale ambiguity: The difficulty of determining the absolute size and distance of an object from a single 2D image.
- Occlusion: How occluded objects can affect the accuracy of 3D bounding box prediction.
- Viewpoint variation: The impact of different viewing angles on the perceived shape and size of objects.
- Computational resources: The computational complexity and real-time requirements of potential applications.
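
The scale ambiguity listed above can be illustrated with a minimal sketch of the pinhole camera model. The function name `projected_width_px` and the focal length of 1000 pixels are assumptions chosen for illustration; they do not come from the thesis description.

```python
import math


def projected_width_px(focal_px: float, width_m: float, depth_m: float) -> float:
    """Image width in pixels of a fronto-parallel object under the pinhole model."""
    return focal_px * width_m / depth_m


# Assumed focal length in pixels (illustrative value only).
FOCAL_PX = 1000.0

# A 1.8 m wide object at 10 m and a 3.6 m wide object at 20 m.
small_near = projected_width_px(FOCAL_PX, width_m=1.8, depth_m=10.0)
large_far = projected_width_px(FOCAL_PX, width_m=3.6, depth_m=20.0)

# Both objects cover the same number of pixels, so size and distance
# cannot be separated from this single measurement alone.
print(math.isclose(small_near, large_far))
```

Monocular methods typically resolve this ambiguity with additional cues, for example learned priors on object dimensions or known camera geometry.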

Design and Development of an Oriented 3D Bounding Box CNN:

Based on the literature review and the identified system constraints, design a novel single-shot object detection CNN architecture, or adapt an existing one, to predict oriented 3D bounding boxes.
This will involve:
- Choosing an appropriate backbone network.
- Designing the output layers to predict the parameters of the 3D bounding box (e.g., center coordinates, dimensions, and Euler angles or quaternions for orientation).
- Defining a suitable loss function that incorporates the different aspects of 3D bounding box prediction.
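
As a rough illustration of the last two points, the sketch below shows one possible box parameterization and a simple multi-term loss. It is a simplification under stated assumptions, not the architecture the thesis is expected to produce: orientation is reduced to a single yaw angle about the vertical axis instead of full Euler angles or quaternions, and the names `box_corners` and `box_loss`, the axis convention, and the loss weights are all hypothetical.

```python
import math
from itertools import product


def box_corners(center, dims, yaw):
    """Return the 8 corners of a box given its center (x, y, z),
    dims (length, width, height), and yaw in radians about the z axis.
    Assumption: z points up and yaw is the only rotation."""
    cx, cy, cz = center
    length, width, height = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        dx, dy, dz = sx * length, sy * width, sz * height
        # Rotate the offset in the ground plane, then translate to the center.
        corners.append((cx + c * dx - s * dy, cy + s * dx + c * dy, cz + dz))
    return corners


def box_loss(pred, target, w_center=1.0, w_dims=1.0, w_yaw=1.0):
    """Weighted sum of three terms (weights are illustrative assumptions):
    L1 on the center, L1 on log-dimension ratios, and 1 - cos(yaw error)."""
    (pc, pd, pyaw), (tc, td, tyaw) = pred, target
    center_term = sum(abs(p - t) for p, t in zip(pc, tc))
    dims_term = sum(abs(math.log(p / t)) for p, t in zip(pd, td))
    yaw_term = 1.0 - math.cos(pyaw - tyaw)  # periodic, so 0 and 2*pi agree
    return w_center * center_term + w_dims * dims_term + w_yaw * yaw_term


target = ((0.0, 0.0, 0.0), (4.0, 2.0, 1.5), 0.0)
shifted = ((1.0, 0.0, 0.0), (4.0, 2.0, 1.5), 0.0)
print(len(box_corners(*target)), box_loss(shifted, target))
```

The cosine-based yaw term is used here because a plain difference of angles would penalize predictions that differ from the target by a full turn.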

Training and Evaluation on Real-World Data:

Select suitable real-world datasets with 3D object annotations (both Kapsch proprietary and public).
Implement and train the model.
Evaluate the performance of the trained model using appropriate 3D object detection metrics (e.g., Average Precision at different IoU thresholds in 3D space).
Analyze the results, identify limitations, and discuss potential future improvements.
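
To make the IoU-based metric concrete, the following minimal sketch computes the 3D IoU of two axis-aligned boxes. This is a deliberate simplification: the IoU of oriented boxes additionally requires intersecting rotated footprints, which is omitted here. The function name `iou_3d_axis_aligned` and the box layout `(xmin, ymin, zmin, xmax, ymax, zmax)` are assumptions for illustration.

```python
def iou_3d_axis_aligned(a, b):
    """Intersection over union of two axis-aligned 3D boxes.
    Each box is (xmin, ymin, zmin, xmax, ymax, zmax)."""
    intersection = 1.0
    for axis in range(3):
        overlap = min(a[axis + 3], b[axis + 3]) - max(a[axis], b[axis])
        if overlap <= 0.0:
            return 0.0  # the boxes do not overlap along this axis
        intersection *= overlap

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return intersection / (volume(a) + volume(b) - intersection)


predicted = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)

# A detection counts as a true positive at a given threshold only if its
# IoU with a ground-truth box reaches that threshold.
iou = iou_3d_axis_aligned(predicted, ground_truth)
print(iou >= 0.5, iou >= 0.25)
```

Average Precision is then obtained by ranking detections by confidence and evaluating precision and recall at the chosen IoU threshold.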

Expected Deliverables:

A comprehensive literature review on monocular 3D object detection CNNs.
A detailed description of the designed and implemented model architecture.
A thorough evaluation of the model's performance on real-world data.
A written thesis document summarizing the research process, findings, and conclusions.
Potentially a working implementation of the developed model.

This thesis provides an excellent opportunity to delve into the challenging and rapidly evolving field of monocular 3D object detection. The student will gain practical experience in literature review, deep learning model design, implementation, training, and evaluation on real-world data.

Your Profile:

Required background: studies in Computer Science, Software Engineering, Information Technology, Geoinformatics, or related fields
Fluent English skills
Interest in technology
Willingness and ability to work independently
Excellent communication and teamwork skills
Conscientiousness and reliability
Strong analytical skills with a precise and structured approach
Start: Immediately
Duration: 36 months
Successful completion of the master's thesis will be rewarded with 3000.

Contact:

Edwin Frühwirth

About Company

Kapsch TrafficCom AG

130 years of experience: Kapsch is Austria's leading partner for digitalization and intelligent mobility. Start your successful future now!