Google BigQuery PySpark Data Engineer

Job Location

Dallas - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

  • The GCP Data Engineer will create, deliver, and support custom data products, as well as enhance and expand team capabilities.
  • They will analyze and manipulate large datasets, supporting the enterprise by activating data assets for Enabling Platforms and analytics.
  • Google Cloud Data Engineers will be responsible for designing transformation and modernization efforts on Google Cloud Platform using GCP services.
Responsibilities:
  • Build data systems and pipelines on GCP using Dataproc, Dataflow, Data Fusion, BigQuery, and Pub/Sub (see the PySpark sketch after this list).
  • Implement schedules, workflows, and tasks in Cloud Composer/Apache Airflow (see the Composer/Airflow sketch after this list).
  • Create and manage data storage solutions using GCP services such as BigQuery, Cloud Storage, and Cloud SQL.
  • Monitor and troubleshoot data pipelines and storage solutions using GCP's Cloud Monitoring (formerly Stackdriver).
  • Develop efficient ETL/ELT pipelines and orchestration using Dataprep and Google Cloud Composer.
  • Develop and maintain data ingestion and transformation processes using Apache PySpark and Dataflow.
  • Automate data processing tasks using scripting languages such as Python or Bash.
  • Ensure data security and compliance with industry standards by configuring IAM roles, service accounts, and access policies.
  • Automate cloud deployments and infrastructure management using Infrastructure as Code (IaC) tools such as Terraform or Google Cloud Deployment Manager.
  • Participate in code reviews, contribute to development best practices, and use developer-assist tools to create robust, fail-safe data pipelines.
  • Collaborate with Product Owners, Scrum Masters, and Data Analysts to deliver user stories and tasks and ensure deployment of pipelines.
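
A minimal sketch of the kind of PySpark batch ingestion described above, assuming a Dataproc cluster with the spark-bigquery connector available (included on recent Dataproc images). Bucket, dataset, and table names are placeholders, not the employer's actual systems:

```python
# Hypothetical example: read raw CSV files from Cloud Storage, apply basic
# cleansing/validation, and write the result to BigQuery. All resource names
# (buckets, dataset, table) are placeholders for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("gcs-to-bigquery-example")
    .getOrCreate()
)

# Read raw data landed in a Cloud Storage bucket (placeholder path).
raw = (
    spark.read
    .option("header", "true")
    .csv("gs://example-raw-bucket/orders/*.csv")
)

# Basic cleansing: drop rows missing the key, cast amounts, stamp the load time.
cleaned = (
    raw.dropna(subset=["order_id"])
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("load_ts", F.current_timestamp())
)

# Write to BigQuery via the spark-bigquery connector; the indirect write method
# stages data in a temporary GCS bucket (placeholder name).
(
    cleaned.write
    .format("bigquery")
    .option("table", "example_project.example_dataset.orders_cleaned")
    .option("temporaryGcsBucket", "example-temp-bucket")
    .mode("append")
    .save()
)
```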
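And a minimal Cloud Composer/Airflow orchestration sketch, assuming the Google provider package (installed by default in Cloud Composer). It schedules the PySpark job above on Dataproc and runs a downstream BigQuery step; project, region, cluster, and bucket values are placeholders:

```python
# Hypothetical DAG: submit the ingestion job to Dataproc, then aggregate in BigQuery.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

PROJECT_ID = "example-project"   # placeholder
REGION = "us-central1"           # placeholder

PYSPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": "example-dataproc-cluster"},
    "pyspark_job": {"main_python_file_uri": "gs://example-code-bucket/jobs/gcs_to_bq.py"},
}

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark ingestion job to Dataproc.
    ingest = DataprocSubmitJobOperator(
        task_id="ingest_orders",
        project_id=PROJECT_ID,
        region=REGION,
        job=PYSPARK_JOB,
    )

    # Downstream aggregation executed directly in BigQuery.
    aggregate = BigQueryInsertJobOperator(
        task_id="aggregate_orders",
        configuration={
            "query": {
                "query": (
                    "CREATE OR REPLACE TABLE example_dataset.daily_order_totals AS "
                    "SELECT DATE(load_ts) AS load_date, SUM(amount) AS total_amount "
                    "FROM example_dataset.orders_cleaned GROUP BY load_date"
                ),
                "useLegacySql": False,
            }
        },
    )

    ingest >> aggregate
```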
Experience required:
  • 5 years of application development experience using one of the core cloud platforms (AWS, Azure, or GCP).
  • Minimum 1 year of GCP experience. Experience working in GCP-based big data deployments (batch/real-time) leveraging BigQuery, Bigtable, Google Cloud Storage, Pub/Sub, Data Fusion, Dataflow, Dataproc, and Airflow/Cloud Composer.
  • 2 years of coding experience in Java/Python/PySpark and strong proficiency in SQL.
  • Work with the data team to analyze data, build models, and integrate massive datasets from multiple sources for data modeling.
  • Extracting, loading, transforming, cleaning, and validating data; designing pipelines and architectures for data processing.
  • Architecting and implementing next-generation data and analytics platforms on GCP. Experience working with Agile and Lean methodologies.
  • Experience working with either a MapReduce or an MPP system at any scale. Experience working in a CI/CD model to ensure automated orchestration of pipelines.
  • Work Location: Dallas, TX

Employment Type

Full Time
