We are seeking an experienced Data Engineer to lead the design, development, and optimization of end-to-end data pipelines and cloud-based solutions. You will be responsible for architecting scalable data and analytics systems, ensuring data integrity, and implementing software engineering best practices and patterns. The ideal candidate has a strong background in ETL, big data technologies, and cloud services, with a proven ability to drive complex projects from concept to production.
This job description in no way states or implies that these are the only duties to be performed by the teammate occupying this position. The selected candidate may perform other related duties assigned to meet the ongoing needs of the business.
Data Architecture and Engineering
Design and implement scalable data pipelines for data ingestion, transformation, and storage.
Architect and optimize data lakes and data warehouses to support analytics and reporting needs.
Develop robust ETL processes to integrate structured and unstructured data from diverse sources.
Ensure high data quality through cleaning, validation, and transformation techniques.
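To illustrate the kind of cleaning and validation work this role involves, here is a minimal sketch in Python; the record fields and rules are hypothetical, not part of this posting:

```python
from datetime import datetime
from typing import Optional

# Hypothetical required fields for a raw event record.
REQUIRED_FIELDS = {"user_id", "event_type", "timestamp"}

def clean_record(raw: dict) -> Optional[dict]:
    """Validate and normalize one raw record; return None if it fails checks."""
    if not REQUIRED_FIELDS.issubset(raw):
        return None  # reject records missing required fields
    try:
        ts = datetime.fromisoformat(raw["timestamp"])
    except ValueError:
        return None  # reject unparseable timestamps
    return {
        "user_id": str(raw["user_id"]).strip(),
        "event_type": raw["event_type"].lower(),
        "timestamp": ts.isoformat(),
    }

records = [
    {"user_id": " 42 ", "event_type": "CLICK", "timestamp": "2024-01-01T00:00:00"},
    {"user_id": "7", "event_type": "view"},  # missing timestamp -> dropped
]
cleaned = [r for r in (clean_record(x) for x in records) if r is not None]
```

In a production pipeline the same pattern scales up: each transform stays a pure function, so bad records are quarantined rather than silently corrupting downstream tables.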
Cloud and Big Data Solutions
Lead the implementation of big data frameworks such as Hadoop and Spark for processing large datasets.
Develop and optimize solutions on cloud platforms including AWS S3, Azure Data Lake, Google BigQuery, and Snowflake.
Manage data lakes to facilitate efficient data access and processing for downstream applications.
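One common pattern behind "efficient data access" in a data lake is hive-style partitioning of object-store keys, so query engines can prune partitions instead of scanning everything. A hypothetical sketch (bucket and table names are invented):

```python
from datetime import date

def partition_key(bucket: str, table: str, d: date, filename: str) -> str:
    """Build a hive-style partitioned object key, e.g. for an S3-backed data lake."""
    return (f"s3://{bucket}/{table}/"
            f"year={d.year}/month={d.month:02d}/day={d.day:02d}/{filename}")

key = partition_key("analytics-lake", "events", date(2024, 3, 5), "part-0000.parquet")
# Engines such as Spark can filter on year/month/day and read only the
# matching prefixes, which is what keeps downstream access efficient.
```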
Database and Data Warehousing
Design, implement, and manage relational (SQL) and non-relational (NoSQL) database systems.
Lead database architecture efforts including schema design, query optimization, and performance tuning.
Oversee the design and management of data warehouses, ensuring reliability, scalability, and security.
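In warehouse work, schema design and query optimization often mean a dimensional (star-schema) layout with indexes on join keys. A toy sketch using Python's built-in SQLite, with invented table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per customer.
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    -- Fact table: one row per sale, keyed to the dimension.
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        amount REAL
    );
    -- Index the foreign key used in joins and aggregations.
    CREATE INDEX idx_sales_customer ON fact_sales(customer_id);
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                 [(1, "EMEA"), (2, "APAC")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(10, 1, 99.0), (11, 1, 1.0), (12, 2, 50.0)])
rows = conn.execute("""
    SELECT c.region, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer c USING (customer_id)
    GROUP BY c.region ORDER BY c.region
""").fetchall()
```

The same fact/dimension split and join-key indexing carry over to warehouse platforms such as BigQuery or Snowflake, where clustering and partitioning play the role of the index.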
Software Development and Automation
Utilize Python and SQL to develop efficient production-ready code for data pipelines and integrations.
Implement scripting automation using Bash and PowerShell to streamline workflows.
Leverage version control (Git) and follow best practices in code optimization, unit testing, and debugging.
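The unit-testing practice above usually means small, deterministic tests around each transform so CI can catch regressions. A minimal sketch (the function and its tests are hypothetical examples, not part of this posting):

```python
def to_cents(amount_str: str) -> int:
    """Parse a decimal dollar amount like '12.34' into integer cents."""
    dollars, _, cents = amount_str.partition(".")
    return int(dollars) * 100 + int((cents + "00")[:2])

# Unit tests: pure transforms like this are trivial to verify in CI.
def test_to_cents():
    assert to_cents("12.34") == 1234
    assert to_cents("5") == 500     # no decimal point
    assert to_cents("0.5") == 50    # single decimal digit

test_to_cents()
```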
Collaboration and Leadership
Act as a technical leader providing guidance on best practices.
Collaborate with cross-functional teams (Data Scientists, Software Engineers, Analysts) to meet business objectives.
Drive innovation by evaluating and integrating emerging tools, technologies, and frameworks.
Establish and maintain CI/CD pipelines to ensure efficient deployment and system reliability.
Required:
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
5 years of experience as a Data Engineer with expertise in building large-scale data solutions.
Proficiency in Python, SQL, and scripting languages (Bash, PowerShell).
Deep understanding of big data tools (Hadoop, Spark) and ETL processes.
Hands-on experience with cloud platforms (AWS S3, Azure Data Lake, Google BigQuery, Snowflake).
Strong knowledge of database systems (SQL, NoSQL), database design, and query optimization.
Experience designing and managing data warehouses for performance and scalability.
Proficiency in software engineering practices: version control (Git), CI/CD pipelines, and unit testing.
Preferred:
Strong experience in software architecture, design patterns, and code optimization.
Expertise in Python-based pipelines and ETL frameworks.
Experience with Azure Data Services and Databricks.
Excellent problem-solving, analytical, and communication skills.
Experience working in agile environments and collaborating with diverse teams.