Data Engineer

New York City, NY - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Department:

Data Engineering

Job Summary

The Data Engineering team is responsible for designing, building, and maintaining the Data Lake infrastructure, including ingestion pipelines, storage systems, and internal tooling for reliable, scalable access to market data.

Key Responsibilities

Ingestion & Pipelines: Architect batch and stream pipelines (Airflow, Kafka, dbt) for diverse structured and unstructured market data. Provide reusable SDKs in Python and Go for internal data producers.
Storage & Modeling: Implement and tune S3, column-oriented, and time-series data storage for petabyte-scale analytics; own partitioning, compression, TTL, versioning, and cost optimisation.
Tooling & Libraries: Develop internal libraries for schema management, data contracts, validation, and lineage; contribute to shared libraries and services that internal data consumers use for research, backtesting, and real-time trading.
Reliability & Observability: Embed monitoring, alerting, SLAs, SLOs, and CI/CD; champion automated testing, data quality dashboards, and incident runbooks.
Collaboration: Partner with Data Science, Quant Research, Backend, and DevOps to translate requirements into platform capabilities and evangelise best practices.

Qualifications:

5 years of experience building and maintaining production-grade data systems, with proven expertise in architecting and launching data lakes from scratch.
Expert-level Python development skills (Go and C nice to have).
Hands-on experience with modern orchestration tools (Airflow) and streaming platforms (Kafka).
Advanced SQL skills, including complex aggregations, window functions, query optimization, and indexing.
Experience designing high-throughput APIs (REST/gRPC) and data access libraries.
Solid fundamentals in Linux, containerization (Docker), and cloud object storage solutions (AWS S3, GCS).
Strong knowledge of handling diverse data formats, both structured and unstructured, with experience optimizing storage strategies such as partitioning, compression, and cost management.
Fluency in English for confident communication, documentation, and collaboration within an international team.

Additional Information:

What we offer:

Working in a modern international technology company without bureaucracy, legacy systems, or technical debt.
Excellent opportunities for professional growth and self-realization.
We work remotely from anywhere in the world with a flexible schedule.
We offer compensation for health insurance, sports activities, and professional training.

Remote Work:

Yes

Employment Type:

Full-time

Key Skills

Apache Hive
S3
Hadoop
Redshift
Spark
AWS
Apache Pig
NoSQL
Big Data
Data Warehouse
Kafka
Scala

About Company

BHFT

BHFT is a proprietary algorithmic trading firm. Our team manages the full trading cycle, from software development to creating and coding strategies and algorithms. Our trading operations cover key exchanges. The firm trades across a broad range of asset classes, including equities, eq ...
