Data Engineering Lead

NuStar Technologies

Job Location:

New York City, NY - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Must Have Technical/Functional Skills

AWS Data Engineering Services (EMR, Glue, Redshift, Aurora, S3, Lambda), Spark, Python, Collibra, Snowflake/Databricks, Tableau.

Roles & Responsibilities

Ingest and model data from APIs, files/SFTP, and relational sources; implement layered architectures (raw/clean/serving) using PySpark/SQL, dbt, and Python.

Design and operate pipelines with Prefect (or Airflow), including scheduling, retries, parameterization, SLAs, and well-documented runbooks.

Build on cloud data platforms, leveraging S3/ADLS/GCS for storage and a Spark platform (e.g., Databricks or equivalent) for compute; manage jobs, secrets, and access.

Publish governed data services and manage their lifecycle with Azure API Management (APIM), including authentication/authorization policies, versioning, quotas, and monitoring.

Enforce data quality and governance through data contracts, validations/tests, lineage, observability, and proactive alerting.

Optimize performance and cost via partitioning, clustering, query tuning, job sizing, and workload management.

Uphold security and compliance (e.g., PII handling, encryption, masking) in line with firm standards.

Collaborate with stakeholders (analytics, AI, engineering, and business teams) to translate requirements into reliable, production-ready datasets.

Enable AI/LLM use cases by packaging datasets and metadata for downstream consumption, integrating via Model Context Protocol (MCP) where appropriate.

Continuously improve platform reliability and developer productivity by automating routine tasks, reducing technical debt, and maintaining clear documentation.

4-15 years of professional data engineering experience.

Strong Python, SQL, and Spark (PySpark) and/or Kafka skills. Snowflake (Snowpipe, Tasks, Streams) as a complementary warehouse.

Databricks (Delta formats, workflows, cataloging) or equivalent Spark platforms.

Minimum 1 year of hands-on experience with Databricks.

Integrating datasets into MCP tools/providers for LLM/agent applications; familiarity with frameworks such as LangChain or LlamaIndex.

Hands-on experience building ETL/ELT with Prefect (or Airflow), dbt, Spark, and/or Kafka.

Experience onboarding datasets to cloud data platforms (storage, compute, security, governance).

Familiarity with Azure/AWS/GCP data services (e.g., S3/ADLS/GCS; Redshift/BigQuery; Glue/ADF).

Git-based workflows, CI/CD, and containerization with Docker (Kubernetes a plus).

Generic Managerial Skills (if any)

Strategic Technical Leadership: Defining data architecture, evaluating new technologies, and setting technical standards for AWS-based pipelines.

Stakeholder Communication: Bridging the gap between technical teams and business stakeholders, gathering requirements, and reporting progress.

Risk Management: Proactively identifying potential bottlenecks in data workflows, security risks, or scalability issues.

Operational Excellence: Implementing automation, optimizing costs, and maintaining high data quality standards.

TCS Employee Benefits Summary:

Discretionary Annual Incentive.

Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.

Family Support: Maternal & Parental Leaves.

Insurance Options: Auto & Home Insurance, Identity Theft Protection.

Convenience & Professional Growth: Commuter Benefits, Certification & Training Reimbursement.

Time Off: Vacation, Time Off, Sick Leave & Holidays.

Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.

Salary Range: $135,000 - $150,000 a year
