Role Description:
Collect, store, process, and analyze large datasets, and build reusable extract, transform, load (ETL) frameworks that cut development effort and deliver cost savings through quality code with performance optimizations thought through right at development time. Learn new technologies, be ready to work on cutting-edge cloud platforms, and work with a team spread across the globe to drive project delivery and recommend development and performance improvements.
Build and implement data ingestion and curation processes using big data tools such as Spark (Scala/Python/Java), Hive, HDFS, Sqoop, Kafka, Kerberos, Impala, and CDP to move huge volumes of data from various platforms for analytics needs, writing high-performance, reliable, and maintainable ETL code.
Strong analytic skills related to working with unstructured datasets. Strong experience in building and designing data warehouses and data stores for analytics consumption, on-premises and in the cloud (real-time as well as batch use cases). Proficiency and extensive experience with Spark, Scala, Python, and performance tuning is a MUST. Hive database management and performance tuning (partitioning, bucketing) is a MUST. Strong SQL knowledge and data analysis skills for data anomaly detection and data quality assurance.
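The responsibilities above center on Spark-based ingestion into partitioned, bucketed Hive tables plus SQL-driven data quality checks. The snippet below is a minimal PySpark sketch of that pattern; the source path, database and table names, columns, and bucket count are hypothetical illustrations, not details from this role.

```python
# Minimal PySpark sketch: curate raw data into a partitioned, bucketed Hive
# table and run a basic data-quality check. All names/paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("etl-curation-sketch")
    .enableHiveSupport()          # required to write managed Hive tables
    .getOrCreate()
)
spark.sql("CREATE DATABASE IF NOT EXISTS curated")

# Ingest raw events from a hypothetical landing zone.
raw = spark.read.parquet("/data/landing/events/")

# Light curation: derive a partition column and drop obviously bad rows.
curated = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .filter(F.col("event_id").isNotNull())
)

# Partition by date and bucket by customer_id to speed up common scans/joins.
(
    curated.write
    .mode("overwrite")
    .partitionBy("event_date")
    .bucketBy(16, "customer_id")
    .sortBy("customer_id")
    .saveAsTable("curated.events")
)

# Simple SQL data-quality check: flag duplicate event_ids per day.
spark.sql("""
    SELECT event_date, event_id, COUNT(*) AS cnt
    FROM curated.events
    GROUP BY event_date, event_id
    HAVING COUNT(*) > 1
""").show(truncate=False)
```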
Essential Skills: Digital Microsoft Azure
Top 3 Required Skills:
1. Should have strong experience in BigQuery, Big SQL, Cloud Composer, Dataproc, and Cloud Storage
2. Should have strong hands-on experience on the Azure platform using Azure Databricks, Azure Data Factory (ADF), Azure Data Lake, and Synapse
3. Should be familiar with relational data modeling and implementing star/snowflake schemas to build analytical data warehouses (as shown in the sketch below)
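As a rough illustration of the star-schema modeling mentioned in item 3, the sketch below creates one dimension and one fact table through Spark SQL and runs a typical aggregate over their join; the database, table, and column names are hypothetical, not part of this role's data model.

```python
# Minimal star-schema sketch on Spark SQL; all names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("star-schema-sketch")
    .enableHiveSupport()
    .getOrCreate()
)
spark.sql("CREATE DATABASE IF NOT EXISTS dw")

# Dimension: one row per customer, keyed by a surrogate key.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dw.dim_customer (
        customer_key BIGINT,
        customer_id  STRING,
        segment      STRING,
        country      STRING
    ) USING PARQUET
""")

# Fact: one row per order line, holding foreign keys and additive measures.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dw.fact_sales (
        customer_key BIGINT,
        date_key     INT,
        quantity     INT,
        amount       DECIMAL(18, 2)
    ) USING PARQUET
    PARTITIONED BY (date_key)
""")

# Typical analytical query: join the fact to its dimension and aggregate.
spark.sql("""
    SELECT d.segment, SUM(f.amount) AS revenue
    FROM dw.fact_sales f
    JOIN dw.dim_customer d ON f.customer_key = d.customer_key
    GROUP BY d.segment
""").show()
```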
Top 3 Preferred Skills:
1. BigQuery, Databricks, Azure Synapse, and Snowflake.
2. PySpark / Spark, Kafka (see the sketch after this list)
3. Python, Scala, SQL, MySQL.
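For the PySpark/Kafka combination in item 2, the following is a minimal Structured Streaming sketch that reads a topic and lands the raw payloads as Parquet; the broker address, topic name, and paths are placeholders, and it assumes the spark-sql-kafka connector is available on the classpath.

```python
# Minimal PySpark Structured Streaming read from Kafka; broker, topic, and
# paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Subscribe to a Kafka topic.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast the value for downstream parsing.
parsed = events.select(F.col("value").cast("string").alias("payload"))

# Land raw payloads as Parquet micro-batches with checkpointing for recovery.
query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "/data/landing/events_stream/")
    .option("checkpointLocation", "/data/checkpoints/events_stream/")
    .start()
)
query.awaitTermination()
```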