We are looking for a certified Data Engineer who will turn data into information, information into insight, and insight into business decisions. This is a unique opportunity to be one of the key drivers of change in our expanding company.
About the Customer
The customer is a leading provider of software solutions and healthcare services. They serve hospitals and health systems, helping them generate better data and insights to enable better healthcare.
The customer offers a robust portfolio of solutions that can be fully integrated and custom-configured to help healthcare organizations efficiently capture information for assessing and improving the quality and safety of patient care.
Requirements
- Background in Data Engineering with hands-on work using Databricks
- Strong expertise in ETL processes and data pipeline development
- Proficiency with Spark and Python for large-scale data processing in Databricks
- Experience with data extraction from web-based sources (APIs, web scraping, or similar approaches)
- Familiarity with handling structured and unstructured data and ensuring efficient data storage and access
- Competency in Azure or other cloud platforms
- Knowledge of SQL and database management for data validation and storage
English level
Upper-Intermediate
Responsibilities
- Design, develop, and implement data ingestion and transformation pipelines using Databricks and Azure Databricks
- Manage and orchestrate data extraction from a public information website, handling bulk data downloads efficiently
- Develop solutions for periodic (monthly) data updates, optimizing data refresh cycles to ensure accuracy and efficiency
- Clean, aggregate, and transform the data for analysis, ensuring data quality and completeness
- Collaborate with stakeholders to understand data requirements and propose solutions for efficient data management
- Implement best practices for ETL processes in a Databricks environment, ensuring scalability and performance
- Monitor and troubleshoot data pipelines, ensuring smooth operation and timely updates
- Document the data engineering processes, workflows, and architecture