Senior Applied ML Engineer (Speech & Audio)

Nile Bits

Job location:

Cairo, Egypt

Monthly salary: Not disclosed

Date posted: 14 hours ago

Number of vacancies: 1

Job Summary

Job Description

We are seeking a highly skilled Senior Applied Machine Learning Engineer with deep expertise in speech and audio. In this role, you will design, fine-tune, and optimize advanced machine learning models for Arabic voice applications. You will work across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment.

This position is ideal for engineers who are passionate about transforming cutting-edge research into scalable, low-latency systems that support natural and accurate Arabic speech interactions.

Key Responsibilities

Benchmark and evaluate TTS and ASR models using Arabic-specific test sets, measuring metrics such as Word Error Rate (WER), naturalness, and dialect coverage.
Fine-tune generative models for voice cloning, zero-shot speaker adaptation, and speech synthesis.
Build and maintain Arabic-focused data pipelines including:
- Audio collection and preprocessing
- Diacritization (Tashkil)
- Data cleaning and augmentation
Optimize model inference for production environments using:
- Quantization
- KV-cache tuning
- Streaming inference techniques
Integrate and evaluate complete speech-to-speech conversational pipelines.
Conduct experiments based on recent research papers and convert findings into production-ready solutions.
Collaborate with engineering and product teams to deploy robust and scalable speech systems.
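
For candidates unfamiliar with the benchmarking metric named above, here is a minimal, illustrative sketch of Word Error Rate on Arabic text. It is not Nile Bits' evaluation code. The function names `strip_tashkil` and `word_error_rate` are hypothetical, and the choice to remove diacritics before scoring is an assumption; a real Arabic benchmark may normalize text differently (for example, hamza and alef forms).

```python
# Illustrative sketch only: word-level WER with optional tashkil removal.
# Assumption: diacritics (U+064B..U+0652, plus dagger alef U+0670) are
# ignored when comparing reference and hypothesis transcripts.
TASHKIL = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}


def strip_tashkil(text: str) -> str:
    """Return text with Arabic short-vowel marks removed."""
    return "".join(ch for ch in text if ch not in TASHKIL)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = strip_tashkil(reference).split()
    hyp = strip_tashkil(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # A diacritized reference scored against an undiacritized hypothesis.
    print(word_error_rate("ذَهَبَ الولد إلى المدرسة", "ذهب الولد الى المدرسة"))
```

Because WER divides by the reference length, it can exceed 1.0 when the hypothesis contains many insertions.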

Qualifications:

Required Qualifications

5 years of experience in Machine Learning, Applied AI, or AI Research.
Strong programming skills in Python.
Extensive hands-on experience with PyTorch and the Hugging Face ecosystem.
Proven experience training and fine-tuning neural models for:
- Text-to-Speech (TTS)
- Automatic Speech Recognition (ASR)
- Audio codecs
Deep understanding of modern speech architectures such as:
- Whisper
- Conformer
- HiFi-GAN
- Diffusion-based models
Experience with audio processing techniques including:
- Voice Activity Detection (VAD)
- Speaker Diarization
- Neural Vocoders
Demonstrated ability to implement and adapt research papers into practical production experiments.
Strong understanding of Arabic language challenges including:
- Diacritization (Tashkil)
- Dialectal variations
- Code-switching
Experience with inference optimization techniques such as:
- Quantization
- Streaming inference
- NVIDIA TensorRT

Preferred Qualifications

Experience developing custom NVIDIA CUDA kernels for high-performance model inference.
Familiarity with speculative decoding and other advanced acceleration techniques.
Experience deploying models at scale in cloud or GPU-based production environments.
Contributions to open-source speech or machine learning projects.

Additional Information:

WHY YOU'LL LOVE US

All employee benefits for free (our famous games room, daily breakfast, fruits, coffee and other hot drinks, soft drinks and juices, company days out, and parties)
Social insurance
Open-door management policy
Full medical insurance
Accommodation and Transportation Allowance
Friendly environment that values innovation and efficiency
Exciting opportunities for career growth and talent development
Feedback encouragement
Recognition and reward programs
Competitive salaries and incentives
Flexible and comfortable schedule
Fun committees
Monetary rewards
Fun, smart, and creative people
Career possibilities with a growing team
Paid vacations
Social benefits

For more information about Nile Bits, please visit our website:

Remote Work:

Yes

Employment Type:

Full-time

About the Company

Nile Bits

Nile Bits aims to provide the best software development services that deliver robust, scalable, and cost-effective software solutions. A team of top-class professionals offers you proven expertise to ensure the quality and reliability of the products we develop for you. We emphasize m ...
