Cloud Data Engineer
Job Purpose
Design, develop, enhance, and integrate an enterprise data warehouse and support AWS cloud infrastructure. Complete projects, prioritize tasks, and provide frequent progress reports with little to no assistance from team leads, adjusting priorities based on company needs. Responsible for implementing scalable solutions that align with our data governance standards and architectural roadmap for data integrations, data storage, reporting, and analytics solutions.
Responsibilities and Duties
The following statements describe the general nature and level of work being performed by people assigned to this classification. These are not to be construed as an exhaustive list of all job duties.
- Build and support components and infrastructure of the data warehouse in Amazon Redshift. Develop ETL/ELT jobs in Matillion that gather and process raw data from multiple disparate sources (including SQL, text files, JSON, and APIs) into a form suitable for data warehousing, data marts, models, reports, and machine learning. Create and maintain AWS Lambda functions using Python.
- Support the hands-on technical delivery of services and capabilities within the AWS environment. Manage the AWS Cloud infrastructure, in particular compute and storage services (EC2, S3); identify and troubleshoot faults and provide recommendations to ensure proper utilization.
- Gather and interpret ongoing business requirements and research required data. Collaborate with business stakeholders, subject matter experts, and the Quality Assurance team to ensure valid and proper deliverables.
- Participate in data quality initiatives and lead the design of data transformation components to improve and maintain high-quality data. Support performance tuning and design for optimal system performance with large volumes of data.
- Code and write unit/functional tests for new or modified data systems.
- Create supporting technical and functional documentation, including data flow diagrams, and maintain the existing data dictionaries and metadata tables used for managing data. Explain tasks and processes clearly and concisely so they can be understood by non-technical business associates.
- Share knowledge with and mentor less experienced employees to promote best practices and foster growth.
- Act as a stand-in for the team lead in projects and meetings when necessary. Complete code reviews for other team members.
- Support business decisions with ad-hoc analysis as needed.
- Participate in the rotation of after-hours support.
- Additional duties and responsibilities may be assigned.
Education and Qualifications
- Bachelor's degree in a related discipline (Computer Science, Information Systems, Management, Engineering, or similar) or equivalent work experience.
- 5 years of experience implementing on-premises or cloud-based data lake/data warehouse technologies in Redshift.
- 5 years of experience working with high-volume data and building batch and near-real-time ETL processes.
- 3 years of experience with AWS Cloud services such as EC2, S3, Lambda, and API Gateway.
- Understanding of general data modeling concepts.
- Experience with SQL Server and reporting solutions such as SSRS (Reporting Services) and Power BI dashboards.
- Ability to quickly identify and troubleshoot problematic faults in data pipelines and infrastructure.
- Ability to design tables, models, data marts, and/or databases to suit business needs.
- Knowledge of data mapping, data integration, database design, and data warehouse concepts.
- Experience using Jira, Bitbucket, and GitKraken, with a good understanding of Git.
- Desire and ability to learn emerging technologies and methodologies.
- Ability to interpret requests and requirements to build appropriate automated solutions.
- Ability to work with shifting deadlines in a fast-paced environment.
- Strong computer skills, including Microsoft Office (Visio, Excel, Word, and Outlook).
- Strong attention to detail, interpersonal skills, and note-taking skills.
Preferred Skills
- Experience with Matillion preferred; extensive experience with similar ETL tools such as SSIS and AWS Glue is acceptable.
- Experience with Redshift Spectrum.
- Experience building out VPCs, subnets, and network routing on AWS.
- AWS Certifications.
- Experience building and enhancing machine learning models in SageMaker with Python.