Job Description:
As a Data Engineer, you will play a critical role in the development, implementation, and maintenance of data infrastructure and systems. Your primary responsibility will be to design, build, and optimize data pipelines and data warehouses, ensuring the efficient and reliable collection, storage, and processing of large volumes of data. You will collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and translate them into scalable solutions. Your work will enable the organization to extract valuable insights, drive data-based decision-making, and support various business initiatives.
Responsibilities:
- Design, develop, and maintain data pipelines and ETL processes to efficiently ingest, transform, and load data from various sources into data warehouses and data lakes.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and design data models that facilitate efficient data retrieval and analysis.
- Optimize data pipeline performance, ensuring scalability, reliability, and data integrity.
- Implement data governance and security measures to ensure compliance with data privacy regulations and protect sensitive information.
- Identify and implement appropriate tools and technologies to enhance data engineering capabilities and automate processes.
- Conduct thorough testing and validation of data pipelines to ensure data accuracy and quality.
- Monitor and troubleshoot data pipelines to identify and resolve issues, ensuring minimal downtime.
- Develop and maintain documentation, including data flow diagrams, technical specifications, and user guides.
- Collaborate with software engineers and infrastructure teams to optimize data infrastructure, including storage, processing, and retrieval systems.
- Stay up to date with emerging trends and technologies in data engineering, and recommend innovative solutions to improve efficiency and performance.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field. A master's degree is a plus.
- Proven experience as a Data Engineer or in a similar role, with a strong understanding of data engineering concepts, practices, and tools.
- Proficiency in programming languages such as Python, Java, or Scala, and experience with data manipulation and transformation frameworks/libraries (e.g., Apache Spark, Pandas, SQL).
- Solid understanding of relational databases, data modeling, and SQL queries.
- Experience with distributed computing frameworks such as Apache Hadoop, Apache Kafka, or Apache Flink.
- Knowledge of cloud platforms (e.g., AWS, Azure, GCP) and experience with cloud-based data engineering services (e.g., Amazon Redshift, Google BigQuery, Azure Data Factory).
- Familiarity with data warehousing concepts and technologies (e.g., dimensional modeling, columnar databases).
- Strong problem-solving skills and the ability to analyze complex data-related issues.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Attention to detail and a commitment to delivering high-quality work within specified timelines.