Data Engineer Architect

VDart Inc


Job Location: Parsippany, NJ - USA

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

Data Engineer / Architect

Parsippany, NJ (hybrid, onsite 3 days a week) / Remote with 30% travel

Full-time

Job Description:

  • Design and develop scalable data pipelines using dbt Cloud, Databricks, and Apache Airflow to support enterprise analytics and reporting
  • Build and optimize Delta Lake-based data models to enable analytics-ready datasets
  • Implement advanced data modeling techniques, including star schema, fact/dimension design, and SCD Type 1 & Type 2
  • Develop modular, reusable, and testable SQL-based transformations using dbt models, macros, and packages
  • Design and manage incremental data loading strategies, ensuring efficient processing of large-scale datasets
  • Leverage Databricks SQL, Spark, and Delta Lake capabilities for high-performance data processing and optimization
  • Implement robust data quality checks and testing frameworks using dbt tests (e.g., not null, unique, referential integrity)
  • Collaborate with cross-functional teams, including data engineers, data scientists, and BI teams, to deliver business-driven data solutions
  • Integrate dbt pipelines with CI/CD workflows using Git-based version control, and orchestrate jobs via Databricks Workflows or external schedulers
  • Ensure adherence to data governance, security, and compliance standards, leveraging tools like Unity Catalog and enterprise policies
  • Orchestrate end-to-end workflows using Airflow DAGs, ensuring dependency management, scheduling, retries, and fault tolerance

Technical Expertise

  • AWS & Cloud Architecture: Expert-level experience with AWS services (S3, RDS, Bedrock agents), PostgreSQL, and cloud-based data governance
  • Advanced Analytics: Regression analysis, time-series forecasting, multivariate analysis, and classification models
  • MLOps & Deployment: Design and maintain model deployment, monitoring, and automated retraining pipelines
  • Simulation & Forecasting: Agent-based simulation for trial enrollment forecasting and scenario planning

Data & Analytics Capabilities

  • Feature Engineering: Extract insights from site performance, historical enrollment, and competitive landscape data
  • Model Evaluation: Build evaluation frameworks (AUC, precision/recall) and optimize model granularity across disease/geography
  • Enterprise Data Integration: Merge internal sources (CTMS performance data) and external sources (Citeline epidemiological data)
  • Master Data Management: Create Golden ID datasets with data quality monitoring and continuous refresh capabilities

Experience Required

  • 5 years in pharmaceutical/clinical trial analytics
  • Focus on site selection and non-enrollment prediction
  • Proven track record with clinical operations data systems