Audio Data Infrastructure Engineer

Berlin - Germany

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

About us

ai-coustics is building the reliability layer for Voice AI: the system that closes the gap between raw audio input and reliable machine understanding in production. By combining state-of-the-art speech and audio research with real-time, production-grade SDKs, we test, observe, and enable Voice AI systems to work in any environment.
Our software is used by Voice AI companies across Europe and the United States whose products require reliable performance at scale: call center agents, voice agents, telephony apps, and enterprise voice assistants. We believe voice will become the main interface for technology, and ai-coustics is building the foundational infrastructure to make audio input reliable, measurable, and easy to deploy.

We are backed by leading early-stage investors, including Connect Ventures, Partech, and Inovia Capital, as well as angel investors from HuggingFace, DeepMind, and Amazon with deep expertise in AI and developer infrastructure. These partners share our vision and are helping us build a world-class team operating with high levels of responsibility and velocity. We look for people who take ownership, think systemically, and want to solve challenging real-world problems in close collaboration with our customers. If you're motivated by developing technology that is used in practice, shaping an emerging category, and setting a new standard for how Voice AI works in the real world, you'll feel at home at ai-coustics.

Role overview

We're looking for an Audio Data Infrastructure Engineer to design and maintain a robust, scalable data pipeline that transforms raw audio from diverse sources into structured analytics at scale.

In this role, you'll own the database architecture, high-volume ingestion pipelines, and analysis and labeling workflows that process many terabytes of audio. This includes ingesting raw audio, running large-scale ML- and DSP-based analysis, and storing the resulting metadata and analytics efficiently in a large PostgreSQL database.

Your work underpins our training, evaluation, and analysis workflows and requires careful attention to performance, correctness, and fault tolerance across ingestion, processing, and storage layers. The role is on-site in Berlin.

Tasks

Architect and maintain a large-scale PostgreSQL database optimized for analytical workloads.
Design scalable ingestion pipelines for audio data from many sources.
Build distributed compute pipelines for ML inference on audio frames.
Design and maintain efficient metadata storage for audio frames, statistics, and analysis results.
Optimize ETL/ELT pipelines for performance, reliability, and scalability.
Ensure idempotent, fault-tolerant workflows across ingestion and analysis.
Work closely with ML and backend teams to integrate new models and analytics.

Requirements

3 years of experience in Data Engineering, ML Infrastructure, or Distributed Systems, working on production systems at scale.
Deep experience with PostgreSQL at scale, including schema design, partitioning, indexing, and high-throughput bulk loading.
Experience building and operating reliable ETL pipelines using tools such as Airflow, Prefect, Dagster, or custom frameworks.
Strong Python engineering skills, including async processing, multiprocessing, and large-scale batch workflows.
Experience processing very large datasets, on the order of hundreds of millions of rows or TB-scale files, with efficient storage and access patterns.
Practical familiarity with audio data as a modality, including common processing tools (e.g., FFmpeg) and an understanding of how audio artifacts and preprocessing choices affect downstream analysis.
Experience running ML inference pipelines at scale to label, classify, or structure large datasets, with a realistic understanding of what modern ML models can and cannot reliably infer.
A startup mindset: You're comfortable with ambiguity, take ownership of complex systems, and make pragmatic decisions in a fast-moving, product-driven environment. Prior startup or similarly dynamic experience is a strong plus.

Benefits

Opportunity to work at a rapidly growing Voice AI startup backed by top investors.
Compensation and equity: Competitive salary package, additional benefits, and stock options, enabling you to take part in the company's success.
Startup Culture: Dynamic, fast-paced environment with passionate and collaborative colleagues.
High Impact: Groundbreaking startup at a pivotal growth stage, making a real difference in how people experience audio.
Ownership & Autonomy: Take full ownership of projects and ship fast.
Work With the Best: World-class team of engineers and builders, with ample room for professional growth.
Contribute to the Future: Define the landscape of Voice AI technology.

If you are ready to lead the charge in revolutionizing Voice AI and drive our startup to new heights, we would love to hear from you. Apply today to join the ai-coustics team!

Key Skills

Jenkins
Ruby
Python
Active Directory
Cloud
PowerShell
Windows
AWS
Linux
SAN
Java
Troubleshoot
Backup
Puppet
hardware

About Company

Ai-coustics

ai-coustics is a Berlin-based startup pioneering Generative Audio AI technology. We're on a mission to redefine the way people experience speech quality and intelligibility in real-time communication and media content by providing solutions to our millions of people across various vert ...
