Member of Technical Staff – Data Infra

Job Location: San Francisco, CA, USA

Salary: $150k-$425k

Vacancy: 1

Job Description

Tzafon is a foundation model lab building scalable compute systems and advancing machine intelligence, with offices in San Francisco, Stockholm, and Tel Aviv. We recently raised $9.7m in pre-seed funding to advance our mission of expanding the frontiers of machine intelligence.

We're a team of engineers and scientists with deep backgrounds in ML infrastructure and research. Founded by IOI and IMO medalists, PhDs, and alumni from leading tech companies, we train models and build infrastructure for swarms of agents to automate work across real-world environments.

This role will work closely with our researchers on collecting and preparing data for the training of our foundation models. You'll be developing the data engine that powers our models, ensuring the data is clean, diverse, and high-quality.

What You'll Do

  • Build and maintain scalable data pipelines for training and fine-tuning LLMs and agent models

  • Create and optimize distributed computing systems for processing web-scale datasets

  • Clean, deduplicate, normalize, and cluster diverse datasets across structured and unstructured sources

  • Design robust pipelines using tools like Spark, BigQuery, DBT, and Airflow

  • Collaborate with researchers and engineers to develop reproducible dataset curation workflows

  • Monitor data quality and build tools for versioning, observability, and auditing

  • Help define what great data looks like for real-world intelligent agents

  • Develop and maintain core processing primitives (e.g., tokenization, deduplication, chunking) with a focus on scalability

We're looking for

  • 3 years of full-time experience as a data engineer and 6 years of software engineering experience of any kind (including data engineering)

  • Proficiency in Python, Scala, or Java

  • Solid understanding of Spark and ability to write, debug, and optimize Spark code

  • Familiarity with GCP, BigQuery, DBT, Trino, Hex, and other cloud-based data and analytics platforms

  • Experience with ML datasets and data preparation for model training

  • Excitement about joining a fast-moving research team to shape the quality of intelligence from the ground up

Sample Projects

  • Designing and implementing distributed computing architecture for web-scale data processing

  • Building scalable infrastructure for model training data preparation

  • Creating comprehensive monitoring and alerting systems

  • Optimizing tokenization infrastructure for improved throughput

  • Developing fault-tolerant distributed processing systems

  • Implementing new infrastructure components based on research requirements

  • Building automated testing frameworks for distributed systems

Life at Tzafon

  • Full medical, dental, and vision coverage, plus 401(k)

  • Office in SF and Tel Aviv

  • Early-stage equity in a future-defining company

Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. If we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

Compensation

Compensation is $150k-$425k plus an equity package.

We also offer a referral bonus of $20k for referral of successful hires (send to ).


Required Experience:

Staff IC

Employment Type

Full-Time
