Role Summary
We are seeking a highly skilled Lead Data Engineer with deep hands-on experience building large-scale data ingestion pipelines, real-time streaming solutions, and high-performance data processing systems. The ideal candidate excels at writing clean, efficient code; refactoring complex systems; improving scalability and performance; addressing production issues; and delivering reliable data solutions across cloud platforms such as AWS, Azure, or GCP.
Key Responsibilities
- Design, build, and maintain high-volume data ingestion and processing pipelines for batch and real-time workloads.
- Implement and optimize real-time streaming pipelines using platforms such as Kafka.
- Develop scalable data solutions using Databricks, PySpark, Python, and SQL.
- Refactor core pipelines and data services to modernize and optimize them.
- Build robust, fault-tolerant pipelines capable of large-scale, high-throughput data processing.
- Write unit tests, automate validation, and ensure high code quality and reliability.
- Integrate pipelines into CI/CD workflows to streamline and automate deployment processes.
- Identify, troubleshoot, and fix production issues to ensure system reliability and stability.
- Address performance bottlenecks and implement improvements in scalability, throughput, and efficiency.
- Work extensively across AWS, Azure, or GCP cloud environments and cloud-native data services.
- Design and orchestrate end-to-end pipelines using workflow and orchestration tools.
- Collaborate with Data Scientists and BI Engineers to deliver clean, analytics-ready datasets.
- Communicate complex technical topics clearly to non-technical stakeholders.
Required Qualifications
- 8 years of hands-on data engineering experience building and maintaining large-scale data systems.
- Proven experience with high-volume data ingestion, ETL/ELT, and real-time data processing.
- Strong expertise with Kafka or similar streaming technologies.
- Advanced proficiency in Databricks, PySpark, Python, and SQL.
- Experience refactoring core systems to improve code maintainability and performance.
- Demonstrated ability to design and build scalable low-latency data pipelines.
- Strong skills in debugging, performance optimization, and pipeline tuning.
- Hands-on experience with at least one major cloud platform: AWS, Azure, or GCP.
- Experience writing unit tests and integrating solutions with CI/CD pipelines.
- Strong problem-solving skills and excellent communication abilities.