Job Summary (List Format):
- 100% remote position
- Experience in healthcare or logistics domains is highly desired
- Design, build, and maintain data pipelines using Python and PySpark (3 years)
- Develop and manage workflows with Airflow or similar orchestration tools (2 years)
- Work extensively with Google Cloud Platform services, including BigQuery, GCS, Pub/Sub, Cloud Run Functions, and Cloud SQL (3 years)
- Implement real-time data ingestion using Kafka, webhooks, and file-based methods (2 years)
- Integrate APIs via REST/webhooks (2 years)
- Deploy and manage Kubernetes environments, preferably GKE (1–2 years)
- Write and optimize queries in BigQuery SQL and PostgreSQL (2 years)
- Design YAML/config-driven data pipelines (2 years)
- Perform schema transformation, hashing, and data quality framework tasks (2 years)
- Contribute to CI/CD pipelines and observability, and develop lightweight dashboards using Grafana, Streamlit, or Flask UI (1 year)
- Bonus: experience in PostgreSQL, CI/CD, monitoring, dashboarding, or lightweight UI development
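To illustrate what the "YAML/config-driven data pipelines" requirement typically means in practice, here is a minimal sketch of a pipeline whose transformation steps are declared in configuration rather than code. All names here (the `rename`/`drop` operations, the field names, the `CONFIG` structure) are hypothetical examples, not part of the posting; JSON is used instead of YAML only to keep the example within the Python standard library (a real pipeline would more likely parse a YAML file with PyYAML's `yaml.safe_load`).

```python
import json

# Hypothetical pipeline config: each step is declared as data, so the
# pipeline's behavior can change without touching the code.
CONFIG = """
{
  "pipeline": "orders_daily",
  "steps": [
    {"op": "rename", "from": "cust_id", "to": "customer_id"},
    {"op": "drop", "field": "internal_notes"}
  ]
}
"""

def apply_step(record: dict, step: dict) -> dict:
    """Apply one config-declared transformation to a single record."""
    out = dict(record)
    if step["op"] == "rename":
        out[step["to"]] = out.pop(step["from"])
    elif step["op"] == "drop":
        out.pop(step["field"], None)
    return out

def run_pipeline(records: list[dict], config: str) -> list[dict]:
    """Run every configured step, in order, over all records."""
    cfg = json.loads(config)
    for step in cfg["steps"]:
        records = [apply_step(r, step) for r in records]
    return records

rows = [{"cust_id": 1, "internal_notes": "x", "total": 9.5}]
print(run_pipeline(rows, CONFIG))
```

The same dispatch pattern scales to PySpark by mapping each `op` onto DataFrame operations (`withColumnRenamed`, `drop`, and so on) instead of dict manipulation.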
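The "hashing" item under schema transformation usually refers to pseudonymizing identifiers so records can still be joined without exposing raw values, which matters in the healthcare domain mentioned above. A minimal sketch using only the standard library's `hashlib`; the salt value and field name are placeholders, not anything from the posting.

```python
import hashlib

def hash_field(value: str, salt: str = "pipeline-salt") -> str:
    """Return a stable, salted SHA-256 digest of a field value.

    Deterministic: the same input always yields the same digest, so
    hashed identifiers remain joinable across tables. In production the
    salt would come from a secret store, not a hardcoded default.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

# Deterministic 64-character hex digest; distinct inputs diverge.
print(hash_field("patient-123"))
```

Because the hash is one-way, this is suitable for de-identification but not for cases where the original value must be recovered (those call for encryption instead).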