We are seeking an experienced Cloud Data Engineer with a strong background in AWS, Azure, and GCP. The ideal candidate will have extensive experience with cloud-native ETL tools such as AWS DMS, AWS Glue, Kafka, Azure Data Factory, and GCP Dataflow, as well as other ETL tools such as Informatica and SAP Data Intelligence. You will be responsible for designing, implementing, and maintaining robust data pipelines and building scalable data lakes. Experience with data platforms such as Redshift, Snowflake, Databricks, Synapse, and others is essential. Familiarity with data extraction from SAP or ERP systems is a plus.
Key Responsibilities:
Design and Development:
- Design, develop, and maintain scalable ETL pipelines using cloud-native tools (AWS DMS, AWS Glue, Kafka, Azure Data Factory, GCP Dataflow, etc.).
- Architect and implement data lakes and data warehouses on cloud platforms (AWS, Azure, GCP).
- Develop and optimize data ingestion, transformation, and loading processes using Databricks, Snowflake, Redshift, BigQuery, and Azure Synapse.
- Implement ETL processes using tools such as Informatica, SAP Data Intelligence, and others.
- Develop and optimize data processing jobs using Spark and Scala.
Data Integration and Management:
- Integrate various data sources, including relational databases, APIs, unstructured data, and ERP systems, into the data lake.
- Ensure data quality and integrity through rigorous testing and validation.
- Perform data extraction from SAP or ERP systems when necessary.
Performance Optimization:
- Monitor and optimize the performance of data pipelines and ETL processes.
- Implement best practices for data management, including data governance, security, and compliance.
Collaboration and Communication:
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
- Collaborate with cross-functional teams to design and implement data solutions that meet business needs.
Documentation and Maintenance:
- Document technical solutions, processes, and workflows.
- Maintain and troubleshoot existing ETL pipelines and data integrations.
Education:
- Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degrees are a plus.
Experience:
- 7 years of experience as a Data Engineer or in a similar role.
- Proven experience with cloud platforms: AWS, Azure, and GCP.
- Hands-on experience with cloud-native ETL tools such as AWS DMS, AWS Glue, Kafka, Azure Data Factory, and GCP Dataflow.
- Experience with other ETL tools such as Informatica and SAP Data Intelligence.
- Experience in building and managing data lakes and data warehouses.
- Proficiency with data platforms such as Redshift, Snowflake, BigQuery, Databricks, and Azure Synapse.
- Experience with data extraction from SAP or ERP systems is a plus.
- Strong experience with Spark and Scala for data processing.
Skills:
- Strong programming skills in Python, Java, or Scala.
- Proficient in SQL and query optimization techniques.
- Familiarity with data modeling, ETL/ELT processes, and data warehousing concepts.
- Knowledge of data governance, security, and compliance best practices.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
Preferred Qualifications:
- Experience with other data tools and technologies such as Apache Spark or Hadoop.
- Certifications in cloud platforms (AWS Certified Data Analytics Specialty, Google Professional Data Engineer, Microsoft Certified: Azure Data Engineer Associate).
- Experience with CI/CD pipelines and DevOps practices for data engineering.
The selected applicant will be subject to a background investigation, which will be conducted, and whose results will be used, in compliance with applicable law.