Join our innovative team at Mayo Clinic, where we are shaping the future of healthcare through cutting-edge generative AI solutions. We are seeking a Senior Data Engineer with deep expertise in building scalable, secure, and high-performance data infrastructure to support the development and deployment of large language models (LLMs) and AI-powered applications. The ideal candidate will bring strong data engineering fundamentals, proficiency in Python and Bash, and advanced knowledge of Google Cloud Platform (GCP) services, including BigQuery, Dataflow, Pub/Sub, and Cloud Storage. A deep understanding of data quality frameworks is essential: this includes designing and implementing strategies to ensure data accuracy, completeness, consistency, uniqueness, timeliness, and validity throughout the data lifecycle. Experience with real-time and batch ETL pipelines, data governance aligned with HIPAA standards, and infrastructure automation using Terraform is critical. Familiarity with tools like Vertex AI and the ability to integrate machine learning workflows into robust data pipelines will be key to enabling our AI-driven research and clinical solutions. Experience with the Cloud Healthcare API and healthcare data standards such as FHIR, HL7v2, and DICOM is a plus.
Develops and deploys data pipelines, integrations, and transformations to support analytics and machine learning applications and solutions as part of an assigned product team, using various open-source programming languages and vended software to meet the desired design functionality for products and programs. The position requires maintaining an understanding of the organization's current solutions, coding languages, and tools, and regularly requires the application of independent judgment. May provide consultative services to departments/divisions and leadership committees. Demonstrated experience in designing, building, and installing data systems, and in how they are applied to the Department of Data & Analytics technology framework, is required. The candidate will partner with product owners and Analytics and Machine Learning delivery teams to identify and retrieve data, conduct exploratory analysis, pipeline and transform data to help identify and visualize trends, build and validate analytical models, and translate qualitative and quantitative assessments into actionable insights.
During the selection process, you may participate in a Codility test as well as an OnDemand (pre-recorded) interview that you can complete at your convenience. During the OnDemand interview, a question will appear on your screen, and you will have time to consider each question before responding. You will have the opportunity to re-record your answer to each question; Mayo Clinic will only see the final recording. The complete interview will be reviewed by a Mayo Clinic staff member, and you will be notified of next steps.
This is a full-time remote position within the United States. However, the incumbent may be asked to work on campus 1-2 days per month; therefore, preference is that the incumbent lives within a reasonable driving distance of a Mayo Clinic campus.
Mayo Clinic will not sponsor or transfer visas for this position, including F1 OPT STEM.
A Bachelor's degree in a relevant field such as engineering, mathematics, computer science, information technology, health science, or other analytical/quantitative field and a minimum of five years of professional or research experience in data visualization, data engineering, or analytical modeling techniques; OR an Associate's degree in a relevant field such as engineering, mathematics, computer science, information technology, health science, or other analytical/quantitative field and a minimum of seven years of professional or research experience in data visualization, data engineering, or analytical modeling techniques. In-depth business or practice knowledge will also be considered.
The incumbent must have the ability to manage a varied workload of projects with multiple priorities and stay current on healthcare trends and enterprise changes. Interpersonal skills, time management skills, and demonstrated experience working on cross-functional teams are required. Requires strong analytical skills, the ability to identify and recommend solutions, and a commitment to customer service. The position requires excellent verbal and written communication skills, attention to detail, and a high capacity for learning and problem resolution.
Advanced experience in SQL is required. Strong experience in scripting languages such as Python, JavaScript, PHP, C, or Java, and in API integration, is required. Experience in hybrid data processing methods (batch and streaming) such as Apache Spark, Hive, Pig, and Kafka is required. Experience with big data, statistics, and machine learning is required. The ability to navigate Linux and Windows operating systems is required. Knowledge of workflow scheduling (Apache Airflow, Google Composer), infrastructure as code (Kubernetes, Docker), and CI/CD (Jenkins, GitHub Actions) is preferred. Experience in DataOps/DevOps and agile methodologies is preferred. Experience with hybrid data virtualization such as Denodo is preferred. Working knowledge of Tableau, Power BI, SAS, ThoughtSpot, DASH, d3, React, Snowflake, SSIS, and Google BigQuery is preferred.
Google Cloud Platform (GCP) certification is preferred.
Required Experience:
Senior IC
Full-Time