Data / ML Platform Engineer (Python, LLM Pipelines, Batch Processing)
Job Summary
We are looking for a Data / ML Platform Engineer to design and manage Python-based batch pipelines for evaluating machine learning and LLM models.
The role involves building scalable pipelines that process large datasets, interact with LLM APIs, and store structured outputs for analytics and benchmarking.
Key Responsibilities
ML / Data Pipeline Development
- Build and maintain Python batch pipelines for:
- Model evaluation
- Dataset processing
- Design resumable and fault-tolerant workflows
LLM Integration
- Integrate with GPT, Claude, and other LLM APIs
- Execute large-scale evaluation workflows
Data Engineering
- Load and process data from cloud storage (S3, GCS, Azure Blob)
- Transform and structure data for analytics
Database & Storage
- Store structured outputs in MongoDB
- Ensure efficient data retrieval and storage
Performance & Scalability
- Optimize:
- Pipeline execution
- Cost and latency
- Handle high-volume batch jobs
Cloud & DevOps
- Deploy pipelines on cloud platforms (AWS / GCP / Azure)
- Use:
- CI/CD pipelines
- Version control (Git)
Monitoring & Reliability
- Implement:
- Logging and monitoring
- Error handling & retries
- Ensure pipeline stability
Required Skills
Core Skills
- Strong Python development
- Experience with:
- Batch pipelines / ETL
- Data processing
AI / ML
- Hands-on with:
- LLM APIs (GPT, Claude)
- Prompt engineering
- Experience with RAG pipelines (preferred)
Data & Storage
- MongoDB / NoSQL databases
- Data structuring and transformation
Cloud
- AWS / GCP / Azure
- Cloud storage services
DevOps
- Git / GitHub
- CI/CD pipelines
Experience Required
- 5-8 years of total experience
- 2 years in ML / data pipelines / LLM-based systems