As a Data Engineer, you will design, develop, and maintain data solutions that support data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems, contributing to the overall efficiency and reliability of data operations.

Roles & Responsibilities:
- Good grasp of batch processing and workflow management
- Hands-on knowledge of PySpark batch and workflow management
- Deploy code and APIs from lower environments to production
- Review runbooks and code artefacts
- Attend and understand deployment knowledge transfer (KT) sessions
- Handle workflow failures (5-10 daily), debug issues, and rerun jobs; an illustrative sketch of this kind of job follows the skills list below
- Optimize workflows for performance and resource utilization by identifying and addressing bottlenecks
- Monitor job latencies and provide recommendations for optimization
- Monitor and optimize data pipelines for performance

Professional & Technical Skills:
- Must To Have Skills: Proficiency in PySpark.
- Good To Have Skills: Experience with Apache Kafka.
- Strong understanding of data warehousing concepts and architecture.
- Familiarity with cloud platforms such as AWS or Azure.
- Experience with SQL and NoSQL databases.

Additional Information:
- The candidate should have a minimum of 3 years of experience in PySpark.
- This position is based at our Pune office.
- 15 years of full-time education is required.
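For candidates unfamiliar with the day-to-day, the sketch below illustrates the kind of PySpark batch job this role would build, monitor, and rerun on failure. It is a minimal, hypothetical example: the paths, schema, job name, and retry policy are placeholders and are not part of this posting.

```python
# Illustrative sketch only: a small PySpark batch job plus a simple retry wrapper,
# approximating "handle workflow failures, debug issues, and rerun jobs".
# Paths, column names, and retry settings below are hypothetical.
import time

from pyspark.sql import SparkSession, functions as F


def run_batch(spark, input_path, output_path):
    """Read raw events, aggregate per day, and write partitioned Parquet."""
    events = spark.read.json(input_path)
    daily = (
        events
        .withColumn("event_date", F.to_date("event_ts"))
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count"))
    )
    daily.write.mode("overwrite").partitionBy("event_date").parquet(output_path)


def run_with_retries(job, retries=3, backoff_seconds=60):
    """Rerun a failed job a few times before escalating, mimicking manual reruns."""
    for attempt in range(1, retries + 1):
        try:
            return job()
        except Exception as exc:  # in practice, catch narrower Spark/IO errors
            print(f"Attempt {attempt} failed: {exc}")
            if attempt == retries:
                raise
            time.sleep(backoff_seconds)


if __name__ == "__main__":
    spark = SparkSession.builder.appName("daily_event_rollup").getOrCreate()
    run_with_retries(lambda: run_batch(spark, "s3://raw/events/", "s3://curated/daily_rollup/"))
    spark.stop()
```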