This is a remote position.
We are seeking an AI Data Engineer to design and build production-grade data pipelines that power machine learning systems. This role focuses on creating scalable ingestion, transformation, and feature engineering workflows that support model training, evaluation, and real-time inference.
You will work closely with Data Scientists, Machine Learning Engineers, and Platform teams to ensure high-quality, reliable, and efficient data flows across cloud environments. The ideal candidate understands both traditional data engineering and the unique data needs of ML systems.
Key Responsibilities:
Design and build scalable data pipelines for ML workflows
Develop feature engineering and data preparation processes
Implement batch and real-time data ingestion systems
Ensure data quality through validation and monitoring
Collaborate with ML engineers to support model training and deployment
Integrate pipelines with orchestration tools (Airflow or similar)
Optimize pipeline performance and cloud cost efficiency
Maintain documentation and version control of data workflows
Requirements:
4 years of experience in Data Engineering
Strong Python and SQL skills
Experience building data pipelines for ML or analytics systems
Hands-on experience with Spark, Databricks, or similar distributed processing frameworks
Experience with orchestration tools (Airflow or similar)
Experience in AWS, Azure, or GCP environments
Familiarity with data quality validation and monitoring frameworks
Understanding of feature engineering and model data lifecycle
Preferred Qualifications:
Experience with streaming systems (Kafka, Kinesis, Pub/Sub)
Experience supporting model deployment and MLOps workflows
Experience with feature stores or vector databases
Familiarity with ML frameworks (TensorFlow, PyTorch)
Required Skills:
Experience with AWS, Azure, or GCP
Strong knowledge of networking, security, and cloud services
Experience with Terraform, ARM, or CloudFormation
Familiarity with containers and orchestration tools
Strong troubleshooting and optimization skills