Data Engineer (Azure | Databricks | Snowflake)

TekWissen LLC

Job Location: Overland Park, KS - USA
Monthly Salary: Not Disclosed
Posted on: 4 hours ago
Vacancies: 1 Vacancy

Job Summary

Overview:
TekWissen is a global workforce management provider headquartered in Ann Arbor, Michigan, that offers strategic talent solutions to our clients worldwide. Our client is a provider of digital technology and transformation, information technology, and services.
Position: Data Engineer (Azure | Databricks | Snowflake)
Location: Frisco, TX and Overland Park, KS
Duration: 6 Months
Job Type: Temporary Assignment
Work Type: Hybrid
JOB SUMMARY:
  • Needed for an Azure-native third-party data enrichment platform using Databricks/Spark and Snowflake; focus on reliable, governed pipelines, strong Spark troubleshooting, privacy/governance, and cost-aware engineering.
Team / Business Context:
  • You will join a data engineering team responsible for third-party data enrichment: augmenting first-party datasets with external identity/attribute data to support analytics, activation, and research.
  • The enriched datasets are consumed by multiple downstream systems and teams including the Customer Data Platform (CDP) and other analytics/research stakeholders.
  • The platform is Azure-native and built primarily on Databricks (processing, some ML workloads) and Snowflake (analytics/warehouse).
  • A major focus is building reliable, governed, vendor-agnostic datasets while ensuring privacy/compliance, data governance, and cost efficiency.
Key Responsibilities
As a Data Engineer you will:
Data Ingestion & Pipeline Development
  • Build and enhance ingestion pipelines for large batch and event-driven paths (streaming may evolve over time).
  • Integrate data from third-party enrichment vendors (identity, attributes; very large volumes), digital platforms via Conversion API (CAPI) integrations (through intermediary/middleware), and rewards/promotions systems (e.g. TMT) for offer issuance/redemption/consumption data.
Data Quality, Reliability & Operations
  • Implement strong data validation, idempotency, replay/backfill strategies, and deduplication to prevent quality drift.
  • Own monitoring, alerting, dashboarding, and operational readiness (wrappers around core pipelines).
  • Troubleshoot failures with root cause analysis, not just reruns: interpret Spark logs, diagnose performance issues (shuffle, skew, partitioning), and improve stability and SLA adherence.
Governance & Compliance (First-class NFR)
  • Apply privacy, compliance, and governance requirements across pipelines and datasets.
  • Support governance standards such as Unity Catalog, lineage, access controls, managing PII vs. non-PII access, and documentation of tables, schemas, catalogs, and cluster usage.
Cost Governance & Performance Optimization
  • Design pipelines with cost awareness from day one: cluster sizing, workload tuning, and efficient compute/storage usage.
  • Make trade-off decisions balancing cost vs. quality vs. SLA.
Collaboration & Ownership
  • Work in a small, fast-moving team; be self-driven and ownership-oriented.
  • Raise and manage data quality escalations when issues are detected.
  • Contribute to evolving architecture (product is early-stage; first live month was recent).
Must-Have Skills (Screening Keywords)
Candidate with hands-on, recent experience in:
  • Strong coding: PySpark, SQL (hands-on, not only orchestration)
  • Databricks: notebooks/jobs, performance tuning fundamentals, medallion patterns
  • Spark fundamentals: partitioning, skew/shuffle optimization, understanding failures via logs
  • Snowflake: data modeling/usage for analytics/warehousing workloads
  • Azure ecosystem: Azure Data Factory (ADF) (orchestration), Azure-native integrations and services exposure
  • Data engineering reliability patterns: validation, idempotency, replay/backfills, dedup, auditability
  • Data governance: Unity Catalog (preferred), lineage, access control patterns, PII handling
  • Ownership mindset: can execute independently without constant approvals/check-ins
Nice-to-Have Skills
  • Event-driven/streaming ingestion exposure (even if primary is batch today)
  • Delta/Databricks patterns such as Delta Live Tables (DLT) (some workflows exist)
  • Experience building config-driven export frameworks for multiple downstream consumers/vendors
  • Exposure/interest in identity resolution concepts (ML optional; ETL strength is priority)
  • Familiarity with CAPI integrations / marketing tech data signals
  • Experience implementing operational telemetry: dashboards, alerts, SLA monitoring
What Good Looks Like (Success Criteria)
  • Ships reliable, well-governed datasets with strong data quality practices
  • Can scale pipelines for very large volumes (hundreds of millions of records per vendor)
  • Prevents silent failures where quality degrades without obvious job failures
  • Balances delivery speed with compliance, governance, and cost controls
TekWissen Group is an equal opportunity employer supporting workforce diversity.