Job Description
Key Responsibilities
Design, develop, and optimize ETL pipelines using PySpark on Google Cloud Platform (GCP).
Work with BigQuery, Cloud Dataflow, Cloud Composer (Apache Airflow), and Cloud Storage for data transformation and orchestration.
Develop and optimize Spark-based ETL processes for large-scale data processing.
Implement best practices for data governance, security, and monitoring in a cloud environment.
Collaborate with data engineers, analysts, and business stakeholders to understand data requirements.
Troubleshoot performance bottlenecks and optimize Spark jobs for efficient execution.
Automate data workflows using Apache Airflow or Cloud Composer.
Ensure data quality, validation, and consistency across pipelines.
Qualifications
5 years of experience in ETL development with a focus on PySpark.
Strong hands-on experience with Google Cloud Platform (GCP) services, including:
BigQuery
Cloud Dataflow / Apache Beam
Cloud Composer (Apache Airflow)
Cloud Storage
Proficiency in Python and PySpark for big data processing.
Experience with data lake architectures and data warehousing concepts.
Knowledge of SQL for data querying and transformation.
Experience with CI/CD pipelines for data pipeline automation.
Strong debugging and problem-solving skills.
Experience with Kafka or Pub/Sub for real-time data processing.
Knowledge of Terraform for infrastructure automation on GCP.
Experience with containerization (Docker, Kubernetes).
Familiarity with DevOps and monitoring tools such as Prometheus, Stackdriver, or Datadog.
Skills: GCP, PySpark, ETL
Required Skills:
Workflows, BigQuery, Kubernetes, Cloud Storage, Docker, Apache Airflow