Our Opportunity:
We are looking for a highly experienced, forward-thinking Data Engineer to join our Enterprise Data Systems team, which drives reporting, analytics, and data science capabilities for teams across the organization. In this key role you will design and build a state-of-the-art data platform that will serve as the foundation for data processing and analysis for the foreseeable future. The ideal candidate brings extensive experience developing large-scale data ingestion and processing frameworks, a strong command of data engineering best practices, and the ability to provide strategic leadership and mentorship to the team.
Key Responsibilities:
- Lead the design, development, and optimization of scalable, efficient, and reusable data ingestion and processing frameworks using Spark and Spark Streaming.
- Architect, implement, and manage data pipelines using a variety of tools, including Kafka, Kinesis, Airflow, and AWS services such as Glue, EMR, S3, SQS, SNS, and Step Functions.
- Guide and mentor the data engineering team, setting best practices, technical guidelines, and guardrails to ensure high-quality data solutions.
- Collaborate with cross-functional teams, including data science and analytics, to design and build data platforms that support the organization's evolving data needs.
- Ensure data quality, observability, and governance by implementing best-in-class practices for data auditing, validation, and monitoring.
- Work hands-on with data warehouse platforms such as Snowflake and Redshift, relational databases such as PostgreSQL, and NoSQL stores such as DynamoDB.
- Develop and manage orchestration pipelines using Airflow to optimize job scheduling and improve operational efficiency.
- Drive innovation by evaluating new data tools, technologies, and frameworks to enhance the company's data infrastructure.
- Implement and maintain CI/CD pipelines and infrastructure-as-code using Terraform for fast, efficient deployments.
- Collaborate on cloud architecture decisions and drive the use of AWS services to manage large-scale data processing.
- Partner with stakeholders across the organization to understand their data needs and deliver reliable, high-quality data products.
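The data-quality responsibilities above (auditing, validation, quarantining bad records) can be sketched as a minimal rule-based check. The record fields, rule names, and `audit_records` helper are illustrative assumptions for this posting, not part of any actual platform:

```python
# Hypothetical data-quality rules: each returns an error message or None.
def not_null(record, field_name):
    """Flag missing or empty values (illustrative check)."""
    value = record.get(field_name)
    if value is None or value == "":
        return f"{field_name} is missing"
    return None

def non_negative(record, field_name):
    """Flag negative numeric values (illustrative check)."""
    value = record.get(field_name)
    if isinstance(value, (int, float)) and value < 0:
        return f"{field_name} is negative"
    return None

def audit_records(records, rules):
    """Run every rule against every record and collect failures.

    `rules` maps a field name to a list of check functions.
    Returns (clean_records, failures) so bad rows can be quarantined
    rather than silently loaded downstream.
    """
    clean, failures = [], []
    for record in records:
        errors = []
        for field_name, checks in rules.items():
            for check in checks:
                message = check(record, field_name)
                if message:
                    errors.append(message)
        if errors:
            failures.append((record, errors))
        else:
            clean.append(record)
    return clean, failures

# Usage: two rows pass; one is quarantined for a negative amount.
rules = {"order_id": [not_null], "amount": [not_null, non_negative]}
records = [
    {"order_id": "A1", "amount": 10.0},
    {"order_id": "A2", "amount": -5.0},
    {"order_id": "A3", "amount": 3.5},
]
clean, failures = audit_records(records, rules)
```

In a production pipeline the same pattern would typically run as a Spark job or an Airflow task, with failures written to a quarantine table and surfaced through monitoring.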
Qualifications:
- Proven experience as a Lead Data Engineer or Senior Data Engineer in a data-driven environment.
- Hands-on experience designing and developing data pipelines with Python, Spark, Spark Streaming, and Kafka for large-scale data processing frameworks.
- Proficiency with AWS data services such as Glue, EMR, S3, SQS, SNS, and Step Functions.
- Strong working knowledge of Snowflake, Redshift, PostgreSQL, DynamoDB, and other data warehouses and databases.
- Experience with Airflow for orchestration and automation of data workflows.
- Deep understanding of data quality, governance, and observability principles.
- Knowledge of CI/CD pipelines and Terraform for cloud infrastructure management.
- Strong knowledge of enterprise data platforms, particularly the AWS data ecosystem.
- Ability to work in a fast-paced, dynamic environment with excellent problem-solving and leadership skills.
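The Airflow orchestration experience listed above amounts to declaring task dependencies and letting a scheduler run them in order. As a minimal stand-in (the task names and `run_pipeline` helper are illustrative, not a real Airflow API), a dependency-ordered run can be sketched with the standard library:

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the tasks it depends on,
# mirroring how an Airflow DAG orders extract -> validate -> transform -> load.
dag = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"validate"},
    "load": {"transform"},
}

def run_pipeline(dag):
    """Execute tasks in dependency order (illustrative stand-in for a scheduler)."""
    executed = []
    for task in TopologicalSorter(dag).static_order():
        executed.append(task)  # a real scheduler would invoke an operator here
    return executed
```

A real Airflow DAG adds retries, scheduling intervals, and per-task operators on top of exactly this dependency ordering.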
Nice to Have:
- Familiarity with Databricks or Cloudera platforms.
- Experience setting up and deploying data infrastructure using Terraform.
- Previous experience working with CI/CD pipelines to streamline deployment processes.
Why Join Us: You will be a key player in building a data platform that supports a data-driven organization, impacting reporting, analytics, and machine learning. You'll work with cutting-edge tools and lead a high-performing team, helping shape the future of data infrastructure within the company.