Job Description
The Cloud Data Engineer will be responsible for designing and implementing end-to-end solutions in the Azure Enterprise Data Lake, building and optimizing data pipelines, and ensuring the availability, performance, scalability, and security of large-scale data processing systems. This role requires a deep understanding of big data technologies, data architecture, infrastructure, CI/CD, and data engineering best practices. Experience with Unity Catalog is a bonus. The Cloud Data Engineer will work closely with architects, leads, and other stakeholders to support data-driven decision-making processes.
Experience Required
- Minimum 8 years of experience, with strong hands-on experience as a senior Data Engineer or in a related role
- 5 years of demonstrated experience in developing Big Data solutions that support business analytics and data science teams
- 3-5 years of proficiency in data ingestion and end-to-end project implementation using Azure Enterprise Data Lake, Azure Functions, Databricks, Blob Storage, Cosmos DB, Azure Stream Analytics, Python, and SQL
- Extensive hands-on experience implementing Lakehouse architecture using the Databricks Data Engineering platform, SQL, Unix shell scripting, SQL Analytics, Delta Lake, and Unity Catalog
- Good understanding of Spark architecture with Databricks, including Structured Streaming, setting up Azure with Databricks, and managing Databricks clusters
- Experienced in DevOps and deployment automation with Azure DevOps, ARM templates, YAML, and Terraform
- Ability to research the latest trends and propose advanced tooling/solutions for Cloud Data Lake & Data Science platforms
- Experience with business intelligence and analytics tools such as OBIEE, Power BI, or Tableau
- Collaborate with application teams and business users to develop new pipelines using cloud data migration methodologies and processes, including tools like Azure Data Factory, Event Hubs, etc.
Roles & Responsibilities
- Drive and implement the design of data schemas, drive cloud data lake platform design decisions and development standards, and maintain data pipelines for data ingestion, processing, and transformation in Azure.
- Drive the analysis, architecture, design, governance, and development of data warehouse, data lake, and business intelligence solutions
- Extract, transform, and load data from source systems to Azure data storage services using a combination of Azure Data Factory, Azure Blob Storage, T-SQL, PySpark, and Azure Databricks
- Integrate data from various sources while ensuring data quality, consistency, and reliability
- Define data requirements, gather and mine large-scale structured and unstructured data, and validate data using various tools in a cloud environment.
- Manage and optimize the Azure Enterprise Data Lake to achieve efficient data storage and processing.
- Develop and optimize ETL processes using Databricks and related tools like Apache Spark
- Implement data validation and cleansing procedures to ensure the quality, integrity, and reliability of the data.
Required Experience:
Senior IC