Role Overview: The Data Engineer will be a key builder on our AI journey, responsible for designing, constructing, and maintaining the data infrastructure required to support our AI initiatives. This role will focus on building robust, scalable data pipelines that extract data from a variety of sources, integrate it with our data lake/warehouse, and prepare it for analysis by our Data Analysts and for training custom AI models. This position is critical for enabling our focus on vendor-provided capabilities and, eventually, for building custom solutions.
Key Responsibilities:
Design, build, and maintain scalable and efficient ETL/ELT data pipelines to ingest data from internal and external sources (e.g., APIs from EPIC and Workday, relational databases, flat files).
[...] and data warehouse to ensure data is clean, accessible, and ready for analysis and model training.
Collaborate with the Data Analyst and other stakeholders to understand their data requirements and provide them with clean, well-structured datasets.
Implement data governance, security, and quality controls to ensure data integrity and compliance.
Automate data ingestion, transformation, and validation processes.
Work with our broader IT team to ensure seamless integration of data infrastructure with existing systems.
Contribute to the evaluation and implementation of new data technologies and tools.
Required Skills & Qualifications:
ETL/ELT Development: Strong experience in designing and building data pipelines using ETL/ELT tools and frameworks.
SQL: Advanced proficiency in SQL for data manipulation, transformation, and optimization.
Programming: Strong programming skills in Python (or a similar language) for scripting, automation, and data processing.
Data Warehousing: Experience with data warehousing concepts and technologies.
Cloud Computing: Hands-on experience with at least one major cloud platform's data services (e.g., Microsoft Azure Data Factory, Microsoft Fabric, IICS).
Version Control: Proficiency with Git for code management and collaboration.
Problem-Solving: Proven ability to troubleshoot and resolve data pipeline issues.
Data Modeling: Experience with various data modeling techniques (e.g., dimensional modeling).
Real-time Processing: Familiarity with real-time data streaming technologies (e.g., Kafka, Azure Event Hubs).
Education: Bachelor's degree in Computer Science, Engineering, or a related field.
Nice-to-Have Skills:
API Integration: Experience building data connectors and integrating with APIs from major enterprise systems (e.g., EPIC, Workday).
CI/CD: Knowledge of Continuous Integration/Continuous Deployment practices for data pipelines.
AI/ML and MLOps: A basic understanding of the machine learning lifecycle and of how to build data pipelines that support model training and deployment.
Experience with Microsoft Fabric: Direct experience with Microsoft Fabric's integrated data platform (OneLake, Data Factory, Synapse Data Engineering).