Description:
HealthPartners is currently hiring for a Data Engineer Shared Services. Our mission is to make healthcare simple and affordable. At HealthPartners, teams use data to enhance patient and member experiences, improve health outcomes, and reduce the per capita cost of care. Data engineers are essential to this mission. They design, build, and optimize data pipelines that ensure reliable and efficient data movement. Their work supports high data quality and integrity, enabling better decision-making across the organization. They collaborate in scrum teams with developers, analysts, and data scientists, often sharing responsibilities to meet sprint goals. They follow industry best practices and develop scalable processes for storing, managing, and delivering data. In their role, data engineers focus on reducing manual data tasks and increasing productivity. They explore and test innovative tools, techniques, and architectures to identify patterns and automate repetitive data preparation and integration tasks.
Required Qualifications:
- Bachelor's degree in computer science, data or social science, operations research, statistics, applied mathematics, econometrics, or a related quantitative field. Alternate experience and education in equivalent areas such as economics, engineering, or physics is acceptable.
- Two (2) years of experience in a hands-on data engineering role (a master's degree is acceptable in lieu of experience).
- Two (2) years of experience with the Python and/or R data science programming languages.
- Two (2) years of experience with SQL (e.g., PL/SQL or PySpark SQL) relational database programming language(s).
- Experience with CI/CD and version control tools (Git preferred).
- Demonstrate understanding of data modeling techniques such as star/snowflake schema, denormalized data modeling, 3NF, etc.
- Demonstrate understanding of working with data formats such as Parquet, Avro, Delta, CSV, JSON, etc.
- Demonstrate understanding of data processing techniques such as full-batch processing, time-based partitioning, distributed and real-time processing, etc.
- Demonstrate strong data profiling and analytic skills; ability to discover and highlight unique patterns/trends within data to identify and solve complex problems.
- Must be motivated, self-driven, curious, and creative.
- Must be a skilled communicator and demonstrate an ability to work with end users and partners.
- Demonstrate the ability to support and complement the work of a diverse development and/or operations team.
Preferred Qualifications:
- Experience with Oracle ERP
- Knowledge of health care operations
- Knowledge/experience of basic accounting principles
- Exposure to Agile/Scrum
- Experience with a hybrid cloud environment consisting of on-premises and public cloud infrastructure.
An ideal candidate will have experience with one or more of the following skill sets:
- Experience with relational databases such as Oracle and SQL Server.
- Experience optimizing and tuning SQL/Oracle queries, stored procedures, and triggers.
- Experience with Python (numpy, pandas, matplotlib, etc.) and Jupyter notebooks for exploratory data analysis, machine learning, and process automation.
- Experience in areas of CI/CD, continuous testing, and site reliability engineering.
- Familiarity with Microsoft Azure applications such as Azure Data Factory, Synapse, Purview, Databricks/Spark, Power BI, and PowerApps.
- Familiarity working with document or NoSQL datastores, particularly MongoDB.
- Familiarity with Power BI data models using advanced Power Query and DAX.
- Interest and desire to contribute to emerging practices around DataOps (CI/CD, IaC, configuration management, etc.).
Hours/Location:
- M-F; Days
- May work in a remote capacity, but local/regional candidates are preferred for occasional onsite needs.
Responsibilities:
- All team members must champion and model our values of partnership, curiosity, compassion, integrity, and excellence, and must contribute to a culture of continuous learning.
- Collaborate with stakeholders, data scientists, and analysts to frame problems, clean and integrate data, and determine the best way to provision that data on demand.
- Collaborate with other developers to design technology solutions that achieve measurable results at scale.
- Help design and develop scalable, efficient data pipeline processes to manage the data ingestion, cleansing, transformation, integration, and validation required to provide access to prepared data sets for analysts and data scientists.
- Utilize development best practices, including technical design reviews, implementing test plans, monitoring/alerting, peer code reviews, and documentation.
- Collaborate with the cross-functional team to resolve data quality and operational issues and ensure timely delivery of products.
- Incorporate core data management competencies, including data governance, data security, and data quality.
- Participate in requirements gathering sessions with business and technical staff to distill technical requirements from business requests.
- Assist in identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
- Perform other duties as required to meet team sprint goals.