Location: Karachi, Lahore, Islamabad (Hybrid)
Experience: 5 Years
Job Type: Full-Time
We are looking for a highly skilled and experienced Data Engineer with a strong foundation in Big Data, distributed computing, and cloud-based data solutions. This role demands a strong understanding of end-to-end data pipelines, data modeling, and advanced data engineering practices across diverse data sources and environments. You will play a pivotal role in building, deploying, and optimizing data infrastructure and pipelines in a scalable, cloud-based architecture.
Key Responsibilities:
Design, develop, and maintain large-scale data pipelines using modern big data technologies and cloud-native tools.
Build scalable and efficient distributed data processing systems using Hadoop, Spark, Hive, and Kafka.
Work extensively with cloud platforms (preferably AWS) and services such as EMR, Glue, Lambda, Athena, and S3.
Design and implement data integration solutions that pull data from multiple sources into a centralized data warehouse or data lake.
Develop pipelines using DBT (Data Build Tool) and manage workflows with Apache Airflow or Step Functions.
Write clean, maintainable, and efficient code using Python, PySpark, or Scala for data transformation and processing (see the sketch after this list).
Build and manage relational and columnar data stores such as PostgreSQL, MySQL, Redshift, Snowflake, HBase, and ClickHouse.
Implement CI/CD pipelines using Docker, Jenkins, and other DevOps tools.
Collaborate with data scientists, analysts, and other engineering teams to deploy data models into production.
Drive data quality, integrity, and consistency across systems.
Participate in Agile/Scrum ceremonies and use JIRA for task management.
Provide mentorship and technical guidance to junior team members.
Contribute to continuous improvement by recommending enhancements to data engineering processes and architecture.
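To give a concrete flavor of the transformation work described above, here is a minimal PySpark sketch of the kind of batch job this role involves: reading raw events from S3, cleaning them, and writing a partitioned, columnar output. The bucket, paths, and column names (event_ts, user_id, amount) are hypothetical examples, not part of this posting.

# Minimal PySpark sketch: read raw events from S3, clean and aggregate them,
# and write the result in a columnar format. The paths and column names
# (event_ts, user_id, amount) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_user_spend").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")   # hypothetical path

daily_spend = (
    raw
    .filter(F.col("amount").isNotNull())                   # drop incomplete records
    .withColumn("event_date", F.to_date("event_ts"))       # derive a partition key
    .groupBy("event_date", "user_id")
    .agg(F.sum("amount").alias("total_spend"))
)

# Partitioned Parquet keeps downstream Athena/warehouse scans cheap.
daily_spend.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_user_spend/"        # hypothetical path
)

Writing partitioned Parquet (or an open table format such as Apache Iceberg) is what keeps Athena and warehouse queries efficient, which is why those tools appear together in this posting.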
Requirements:
5 years of hands-on experience as a Data Engineer.
Deep knowledge of Big Data technologies: Hadoop, Spark, Hive, and Kafka.
Expertise in Python, PySpark, and/or Scala.
Proficiency in data modeling, SQL scripting, and working with large-scale datasets.
Experience with distributed storage such as HDFS and cloud storage (e.g., AWS S3).
Hands-on experience with data orchestration tools such as Apache Airflow or AWS Step Functions (see the DAG sketch after this section).
Experience working in AWS environments with services such as EMR, Glue, Lambda, and Athena.
Familiarity with data warehousing concepts and experience with tools such as Redshift and Snowflake (preferred).
Exposure to tools such as Informatica, Ab Initio, or Apache Iceberg is a plus.
Knowledge of Docker, Jenkins, and other CI/CD tools.
Strong problem-solving skills, initiative, and a continuous-learning mindset.
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
Nice to Have:
Experience with open table formats such as Apache Iceberg.
Hands-on experience with Ab Initio (GDE, Conduct>It) or Informatica tools.
Knowledge of Agile methodology and working experience with JIRA.
Self-driven, proactive, and a strong team player.
Excellent communication and interpersonal skills.
Passion for data and technology innovation.
Ability to work independently and manage multiple priorities in a fast-paced environment.
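For candidates less familiar with orchestration, the following is a minimal Apache Airflow sketch (assuming Airflow 2.4+) of a daily DAG chaining an extract step and a Spark transform, in the spirit of the pipelines described above. The DAG id, task bodies, and schedule are hypothetical, not taken from this posting.

# Minimal Apache Airflow sketch (assumes Airflow 2.4+). The DAG id, task
# bodies, and schedule are hypothetical examples, not part of this posting.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Hypothetical: land raw events from a source system into S3.
    print("extracting raw events")

def transform():
    # Hypothetical: submit the PySpark job (e.g., on EMR) sketched earlier.
    print("running Spark transform")

with DAG(
    dag_id="daily_user_spend",        # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                # "schedule" replaces schedule_interval in 2.4+
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task    # transform runs only after extract succeeds

In practice a role like this would often use EMR or Glue operators rather than PythonOperator for the Spark step, but the dependency wiring is the same.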
Required Experience: Manager