Data Engineer

Soft Source Inc


Job Location: Houston, MS - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Overview:
Delivers the Palantir Foundry exit on a modern Snowflake stack by building reliable, performant, and testable ELT pipelines; recreates Foundry transformations and rule-based event logic; and ensures historical data extraction, reconciliation, and cutover readiness.
Years of Experience:
7 years overall; 3 years hands-on with Snowflake.
Key Responsibilities:
  • Extract historical datasets from Palantir (dataset export, Parquet) to S3/ADLS and load into Snowflake; implement checksum and reconciliation controls.
  • Rebuild Foundry transformations as dbt models and/or Snowflake SQL; implement curated schemas and incremental patterns using Streams and Tasks.
  • Implement the batch event/rules engine that evaluates time-series plus reference data on a schedule (e.g., every 30–60 minutes) and produces auditable event tables.
  • Configure orchestration in Airflow running on AKS and, where appropriate, Snowflake Tasks; monitor, alert, and document operational runbooks.
  • Optimize warehouses, queries, clustering, and caching; manage cost with Resource Monitors and usage telemetry.
  • Author automated tests (dbt tests, Great Expectations, or equivalent), validate parity versus legacy outputs, and support UAT and cutover.
  • Collaborate with BI/analytics teams (Sigma, Power BI) on dataset contracts, performance, and security requirements.
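The checksum and reconciliation controls mentioned above can be sketched in plain Python. This is a minimal illustration, not the team's actual tooling: all function names are hypothetical, and it assumes rows have already been fetched from the legacy export and the Snowflake load as tuples of column values.

```python
import hashlib

def row_fingerprint(row):
    """Stable per-row digest: join column values in a fixed order and hash.
    None is normalized to an empty string so both sides encode it the same way."""
    joined = "\x1f".join("" if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def table_checksum(rows):
    """Order-independent table checksum: XOR the integer value of each row's
    digest, so the same rows in any load order yield the same checksum.
    (Caveat: pairs of identical duplicate rows cancel out under XOR.)"""
    acc = 0
    for row in rows:
        acc ^= int(row_fingerprint(row), 16)
    return acc

def reconcile(source_rows, target_rows):
    """Row-count and checksum parity between the legacy export and the
    Snowflake load; both checks must pass before cutover sign-off."""
    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "checksum_match": table_checksum(source_rows) == table_checksum(target_rows),
    }
```

In practice the same idea is usually pushed down into SQL (e.g., aggregating per-row hashes on both platforms) so the data never leaves the warehouse; the Python version just makes the control auditable and testable in isolation.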
Required Qualifications:
  • Strong Snowflake SQL and Python for ELT utilities and data validation.
  • Production experience with dbt (models, tests, macros, documentation, lineage).
  • Orchestration with Airflow (preferably on AKS/Kubernetes) and use of Snowflake Tasks/Streams for incremental loads.
  • Proficiency with cloud object storage (S3/ADLS), file formats (Parquet/CSV), and bulk/incremental load patterns (Snowpipe, External Tables).
  • Version control and CI/CD with GitHub/GitLab; environment promotion and release hygiene.
  • Data quality and reconciliation fundamentals, including checksums, row/aggregate parity, and schema integrity tests.
  • Performance and cost tuning using query profiles, micro-partitioning behavior, and warehouse sizing policies.
Preferred Qualifications:
  • Experience migrating from legacy platforms (Palantir Foundry, Cloudera/Hive/Spark) and familiarity with Trino/Starburst federation patterns.
  • Time-series data handling and rules/pattern detection; exposure to Snowpark or UDFs for complex transforms.
  • Familiarity with consumption patterns in Sigma and Power BI (Import, DirectQuery, composite models, RLS/OLS considerations).
  • Security and governance in Snowflake (RBAC, masking, row/column policies), tagging, and cost allocation.
  • Exposure to containerized workloads on AKS, lightweight apps for surfacing data (e.g., Streamlit), and basic observability practices.
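The "time-series plus reference data" rules engine described in the responsibilities can be reduced to a simple batch pass. The sketch below is a hypothetical illustration of the pattern, not the actual engine: each scheduled run scans a window of readings, joins them against per-sensor limits from reference data, and emits auditable event rows.

```python
def evaluate_rules(readings, limits):
    """One batch rule pass: flag readings that exceed the per-sensor limit
    taken from reference data, and emit one auditable event row per breach.

    readings -- iterable of (timestamp, sensor_id, value) tuples
    limits   -- dict mapping sensor_id to its threshold (reference data)
    """
    events = []
    for ts, sensor, value in readings:
        limit = limits.get(sensor)
        if limit is not None and value > limit:
            events.append({
                "event_time": ts,    # when the breaching reading occurred
                "sensor": sensor,
                "value": value,
                "limit": limit,      # threshold in force, kept for audit
                "rule": "over_limit",
            })
    return events
```

In the target architecture this logic would live in Snowflake SQL or Snowpark and be triggered by a Task or Airflow DAG on the 30–60 minute cadence; persisting the matched limit alongside each event is what makes the output table auditable after reference data changes.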

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala