AI Data Engineer

Job Location:

Lansing, IL - USA

Monthly Salary: Not Disclosed
Posted on: 13 hours ago
Vacancies: 1 Vacancy

Job Summary

Mode of work: Hybrid
Location: Okemos, Michigan

Job Summary:

We are looking for a Senior AI Data Engineer with strong expertise in AI/ML data pipelines, model data preparation, vector databases, and cloud-based AI infrastructure. The candidate will architect, build, and optimize the data ecosystem that powers machine learning, generative AI, and LLM applications. You'll collaborate closely with Data Scientists, ML Engineers, and Cloud Architects to ensure high-quality, production-ready data for AI models.

Core Responsibilities:

  • Design and build end-to-end AI data pipelines for training, fine-tuning, and inference of ML/Gen-AI models.
  • Create data ingestion, transformation, and feature-engineering workflows optimized for large-scale model training.
  • Develop and manage data lakes, feature stores, and vector databases (e.g., Pinecone, Weaviate, FAISS).
  • Collaborate with ML Engineers to operationalize model deployment pipelines (using MLflow, Kubeflow, or SageMaker Pipelines).
  • Integrate unstructured data (text, image, audio, sensor data) for AI-ready datasets.
  • Implement data labeling, versioning, and lineage tracking for reproducible AI experiments.
  • Optimize performance for large-scale distributed training on Spark, Databricks, or Ray.
  • Ensure data quality, compliance, and governance for AI systems under SOC 2 / GDPR / HIPAA frameworks.
  • Partner with architects to design cloud-native AI infrastructure (AWS SageMaker, Azure AI, GCP Vertex AI).

Required Skills & Experience:

  • 7 years in Data Engineering, with 3 years focused on AI/ML data workflows.
  • Expert in Python (pandas, PySpark, FastAPI) and SQL.
  • Hands-on with data orchestration (Airflow, Prefect, Dagster) and ETL tools (Databricks, Glue, Dataflow).
  • Proficient in cloud AI services: AWS SageMaker, Azure Machine Learning, or GCP Vertex AI.
  • Experience with vector databases (FAISS, Pinecone, ChromaDB) and embedding pipelines (OpenAI API, LangChain).
  • Knowledge of model data lifecycle: training, evaluation, deployment, monitoring.
  • Solid understanding of ML concepts: supervised/unsupervised learning, transformers, and LLM fine-tuning.
  • Experience in MLOps and CI/CD for AI models (Docker, Kubernetes, GitHub Actions, Terraform).
  • Strong command of data privacy, security, and governance for AI datasets.

Preferred Skills:

  • Familiarity with Gen-AI tools (LangChain, LlamaIndex, OpenAI API, Anthropic Claude, Hugging Face).
  • Knowledge of data retrieval and prompt-engineering pipelines (RAG).
  • Experience integrating LLM applications with production APIs.
  • Cloud certifications in AI/ML or Data Engineering.

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala