Job Title: GCP Data Engineer
Location: Remote
Duration: Long-term
Experience Required: Minimum 5 years
Working Hours: 1:30 PM – 10:30 PM IST
Dual Employment: Not allowed (any existing dual employment must be terminated immediately)
Budget: 1.3 LPM
About the Project
The vendor will be migrating data from Teradata to Google BigQuery. Existing jobs built in BTEQ and Informatica will be converted to GCP Dataflow jobs.
The Data Engineer will be responsible for:
Key Responsibilities
Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured data.
Optimize data storage and retrieval using BigQuery and GCP Dataflow (see the illustrative pipeline sketch after this list).
Work with cross-functional teams to define data architecture and model design.
Support data validation, quality assurance, and troubleshooting of pipeline issues.
Collaborate with the DevOps team to automate deployments and implement CI/CD pipelines.
Ensure data security and compliance following organizational and GCP best practices.
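For illustration only, the sketch below shows a minimal Apache Beam (Python SDK) batch pipeline of the kind a converted BTEQ/Informatica job might become on Dataflow: it reads a flat-file extract from Cloud Storage, parses rows, and loads them into BigQuery. The bucket, project, dataset, table, and column names are placeholders, not project specifics.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_csv_line(line):
    # Split one line of a (hypothetical) CSV extract into a BigQuery-ready dict.
    fields = line.split(",")
    return {"customer_id": fields[0], "order_total": float(fields[1])}


def run():
    # PipelineOptions picks up flags such as --runner, --project, and --region.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadExtract" >> beam.io.ReadFromText("gs://example-bucket/teradata_extract/orders*.csv")
            | "ParseRows" >> beam.Map(parse_csv_line)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.orders",  # placeholder table; assumed to exist
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

Such a pipeline can be exercised locally with the DirectRunner during development and submitted to Dataflow by passing --runner=DataflowRunner along with the project and region options.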
Core Technical Skills
Strong proficiency in SQL, Python, PySpark, and Spark SQL for data manipulation and transformation.
Deep understanding of data modeling, normalization, and schema design.
Experience in data lake, data warehouse, and data mart design and management.
Knowledge of batch and streaming data processing architectures.
GCP Tools & Services Expertise
BigQuery: Data warehouse design, optimization, and cost management.
Dataproc: Running Spark/Hadoop jobs.
Cloud Data Fusion: Designing, developing, and deploying data pipelines.
Cloud Dataflow (Apache Beam): Scalable stream and batch data processing.
Cloud Storage: Data lake setup and object lifecycle management.
Cloud Composer (Airflow): Workflow orchestration (see the illustrative DAG sketch after this list).
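As a point of reference for the Cloud Composer item above, the following is a minimal, hypothetical Airflow DAG that schedules a converted pipeline daily. The DAG ID, file paths, project, and region are placeholders; in practice the Google provider's dedicated Dataflow operators would typically replace the BashOperator call.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical daily orchestration of a converted Teradata-to-BigQuery pipeline.
with DAG(
    dag_id="teradata_to_bq_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Launch the (hypothetical) Dataflow job built from a converted BTEQ script.
    run_dataflow = BashOperator(
        task_id="run_dataflow_job",
        bash_command=(
            "python /home/airflow/gcs/dags/pipelines/orders_pipeline.py "
            "--runner=DataflowRunner --project=example-project --region=us-central1"
        ),
    )

    # Placeholder downstream step; real validation would query BigQuery directly.
    validate = BashOperator(
        task_id="validate_load",
        bash_command="echo 'validation handled by a separate quality-check job'",
    )

    run_dataflow >> validate
```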
DevOps & Infrastructure Skills
Cloud IAM: Role-based access control and security best practices.
VPC, Firewall, and KMS: Network configuration and data encryption.
Terraform / Deployment Manager: Infrastructure as Code (IaC).
CI/CD Pipelines: Automation for build, testing, and deployment processes.
Data Governance & BI Tools
Implement data quality, lineage, and cataloging using tools such as Data Catalog (a minimal quality-check sketch follows this list).
Integrate with Looker, Data Studio, or other BI tools for visualization and reporting.
Support compliance auditing and data access transparency.
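As a small, hypothetical example of the data-quality work referenced above, the sketch below runs post-load checks against BigQuery using the google-cloud-bigquery client. The project, dataset, table, and column names are placeholders.

```python
from google.cloud import bigquery

# Hypothetical post-load checks; adjust project, dataset, table, and columns as needed.
client = bigquery.Client(project="example-project")

checks = {
    "row_count": "SELECT COUNT(*) AS n FROM `example-project.analytics.orders`",
    "null_keys": (
        "SELECT COUNT(*) AS n FROM `example-project.analytics.orders` "
        "WHERE customer_id IS NULL"
    ),
}

for name, sql in checks.items():
    # client.query() submits the job; result() blocks until the query completes.
    row = next(iter(client.query(sql).result()))
    print(f"{name}: {row.n}")
```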
Soft Skills & Best Practices
Strong analytical, problem-solving, and communication skills.
Ability to collaborate with cross-functional and international teams.
Experience working in Agile environments.
Understanding of cost optimization, monitoring, and SLOs/SLAs.
Background Verification (BGV) Process
Other Requirements