Position: GCP Big Data Engineer
Location: Philadelphia, PA (Onsite)
Experience: 8 years
Client: UMG
Rate: $62 on W2 and $70 on C2C
Net 60 days payment terms
Job Description:
We are seeking a highly skilled Senior GCP Big Data Engineer to join our team and work on cutting-edge data solutions for our client UMG. The ideal candidate will have extensive experience in designing, building, and optimizing large-scale data pipelines on Google Cloud Platform (GCP) using PySpark, Apache Beam, Dataflow, and BigQuery. You will play a key role in processing and analyzing massive datasets to drive business insights.
Key Responsibilities:
- Design, develop, and optimize scalable Big Data pipelines using GCP services (Dataflow, BigQuery, Pub/Sub, Cloud Storage).
- Implement real-time and batch data processing using Apache Beam, Spark, and PySpark.
- Work with Kafka for event streaming and data integration.
- Orchestrate workflows using Airflow for scheduling and monitoring data pipelines.
- Write efficient Java/Python code for data processing and transformation.
- Optimize BigQuery performance, including partitioning, clustering, and query tuning.
- Collaborate with data scientists and analysts to enable advanced analytics and machine learning pipelines.
- Ensure data reliability, quality, and governance across pipelines.
- Leverage Google Analytics and GFO (Google for Organizations) where applicable.
Required Skills & Experience:
- 8+ years of hands-on experience in Big Data engineering.
- Strong expertise in GCP (Google Cloud Platform): Dataflow, BigQuery, Pub/Sub, Cloud Storage.
- Proficiency in PySpark, Spark, and Java/Scala for Big Data processing.
- Experience with Apache Beam for unified batch/streaming pipelines.
- Solid understanding of Kafka for real-time data streaming.
- Hands-on experience with Airflow for workflow orchestration.
- Strong SQL skills and optimization techniques in BigQuery.
- Experience with distributed computing and performance tuning.
Good to Have:
- Knowledge of GFO (Google for Organizations).
- Familiarity with Google Analytics for data integration.
- Experience with Dataproc, Cloud Composer, or Dataprep.
- Understanding of CI/CD pipelines for Big Data applications.