Data Platform Engineer
Job Summary
We are seeking a Data Platform Engineer to lead the migration of legacy Hadoop/Cloudera platforms to cloud-native environments and to design, build, and operate large-scale batch and streaming data pipelines on the modern data stack.
Responsibilities
1. Cloud Migration & Platform Modernization
- Lead migration projects from legacy Hadoop/Cloudera platforms to cloud-native environments
- Assess existing Big Data architecture and define migration strategy
- Perform component mapping between on-premise tools and cloud-managed services
- Design migration roadmaps, cutover plans, rollback plans, and disaster recovery strategies
- Optimize cloud infrastructure for cost, scalability, availability, and performance
- Support hybrid-cloud and multi-cloud architectures
2. Streaming & Data Engineering
- Design and develop real-time streaming pipelines using Apache Kafka and Apache Flink
- Build large-scale batch and streaming data processing using Apache Spark
- Implement Data Lakehouse architecture (Bronze / Silver / Gold)
- Optimize ETL/ELT workloads for performance and reliability
- Support CDC (Change Data Capture) and event-driven architecture
3. Data Transformation & Workflow Orchestration
- Develop and maintain ELT transformation workflows using dbt
- Design workflow orchestration pipelines using Apache Airflow
- Implement CI/CD pipelines for data platform deployment
- Automate infrastructure provisioning and operational tasks
4. Query Engine & Analytics Platform
- Design distributed query architecture using Trino
- Implement high-performance analytics platform using Apache Doris
- Support BI and dashboard platforms using Redash and Imply
- Optimize query performance and federated query architecture
5. AI/ML & MLOps Platform Support
- Support Data Science and AI platform environments using:
- Jupyter Notebook
- MLflow
- Label Studio
- Langfuse
- Support GPU-based AI/ML workloads
- Collaborate with Data Scientists and AI Engineers
6. Infrastructure & Operations
- Deploy and manage Big Data workloads on Kubernetes and Cloud platforms
- Configure monitoring, logging, backup, and disaster recovery
- Implement security best practices, including IAM, encryption, and compliance controls
- Troubleshoot production issues and perform performance tuning
- Work closely with DevOps, DBA, Security, and Application teams
Qualifications
Education
- Bachelor's degree or higher in Computer Science, Information Technology, Engineering, or related fields
Experience
- Minimum 5 years of experience in Big Data, Data Engineering, or Cloud Platform roles
- Minimum 3 years of experience in Cloud Migration or Platform Modernization projects
- Experience migrating Hadoop/Cloudera ecosystems to cloud platforms
- Experience working in enterprise-scale environments
Technical Skills
- Strong hands-on experience with:
- Apache Kafka
- Apache Flink
- Apache Spark
- dbt
- Apache Airflow
- Trino
- Apache Doris
- Strong SQL and Python programming skills
- Experience with Linux and shell scripting
- Experience with Kubernetes and Docker
- Understanding of Data Lakehouse Architecture
- Experience with cloud platforms (e.g., AWS, OCI)
Preferred Skills
- Experience with:
- Redash
- Imply
- MLflow
- Label Studio
- Langfuse
- Experience with Terraform / Ansible
- Experience with CI/CD pipelines
- Knowledge of Data Governance and Security
- Experience with GPU or AI infrastructure
Soft Skills
- Strong analytical and problem-solving skills
- Good communication and presentation skills
- Strong stakeholder management skills
- Ability to work independently and as part of a team
- Leadership and mentoring capability
- Ability to work under pressure and manage multiple projects
Certifications (Optional)
- OCI Cloud Certification
- AWS Cloud Certification
- Kubernetes Certification (CKA/CKAD)
- Kafka / Spark / Databricks Certifications
Required Experience
Senior individual contributor (IC)