Responsible for developing, implementing, and operating stable, scalable, low-cost solutions to source data from client systems into the Data Lake, data warehouse, and end-user-facing BI applications. Responsible for the ingestion, transformation, and integration of data to provide a platform that supports data analysis and enrichment, and for making data operationally available for analysis. The Data Engineer will be a data pipeline builder and data wrangler who supports application developers, database architects, data analysts, and data scientists on data initiatives, and will ensure that the data delivery architecture is optimal and consistent throughout our ongoing projects. Essential Functions:
- Build and maintain scalable, automated data pipelines. Support critical data pipelines with a highly scalable, distributed architecture.
- Knowledge of the Salesforce API to pull data from the Salesforce system into Databricks Delta tables, including data ingestion from Salesforce, data integration, and data curation.
- Build a scalable, metadata-driven framework that reuses notebooks/pipelines to move data from Salesforce to Databricks. The solution should accommodate future Salesforce-to-Databricks ingestions with minimal changes.
- Deploy, automate, maintain, and manage Azure cloud-based production systems to ensure the availability, performance, scalability, and security of production systems.
- Good architectural understanding to ensure customer success when building new solutions and migrating existing data applications to the Azure platform.
- Conduct full technical discovery, identifying pain points, business and technical requirements, and as-is and to-be scenarios.
- Design and build scalable, highly available, and fault-tolerant systems on the Azure platform.
- Ownership of and responsibility for the end-to-end design, development, testing, and release of key components.
- Understand and implement best practices in the management of data, including master data, reference data, metadata, data quality, and lineage. Experience with code versioning tools and a command of configuration management concepts and tools, including CI/CD and DevOps. Other duties as assigned.
Experience:
- Expert level SQL knowledge and experience.
- Expert-level experience with Python/PySpark/Scala and object-oriented programming.
- Experience with streaming integration and cloud-based data processing systems such as Kafka and Databricks.
- Hands-on knowledge of cloud-based data warehouse solutions such as Azure and Snowflake.
- Experience with Azure cloud architecture and solutions.
- Experience with Azure Data Lake Store, Blob Storage, VMs, Data Factory, SQL Data Warehouse, Azure Databricks, HDInsight, etc. Experience with data pipeline and workflow management tools such as Azure Data Factory and Airflow.
- Experience with Oracle and Microsoft SQL Server database systems.
- Experience working within Agile methodologies.
- Experience with Microsoft Windows and Linux virtual servers. Moderate skill in Power BI.
Other Skills:
- Experience with healthcare data modeling standards such as HL7 or FHIR is preferred.
- Experience processing healthcare data sets (medical records, claims, clinical data, etc.) is preferred.
- Strong project management skills.
- Strong problem-solving, decision-making, and analytical skills.
- Excellent interpersonal and organizational skills; a team player who can effectively partner with all levels of the company.
- Detail-oriented and organized.
- Ability to handle numerous assignments simultaneously.
- Ability to work independently and as part of a team.
- Bachelor's degree (BA or BS) from an accredited college or university, plus a minimum of six (6) years of experience in the specific or a related field.