Job Title: Senior Data Engineer
Location: New York City or Plano, TX (3 days onsite, 2 days remote; open to relocation candidates)
Duration: 12 months; likely extensions
Notes:
Core Role Focus
- Primarily a Data Engineer role
- 80% Data Engineering
- 20% ML exposure
- ML is not the primary focus; strong Data Engineering fundamentals are mandatory.
Must Have Technical Skills
- Python (pandas DataFrames for data engineering use cases)
- PySpark / Spark
- Databricks
- AWS ecosystem
- S3
- Core AWS services (EMR, Glue, Lambda, etc.)
- Exposure to Java is a plus but not mandatory
- Pipeline design & automation
- High volume data processing
- SCD (slowly changing dimensions)
- Streaming & near-real-time data
- Kafka (must understand consumption even if not hands-on at Cap One)
- APIs
- Micro-batching (with emphasis on high-volume use cases)
- Candidates with only micro-batching experience and no exposure to large-scale, high-volume pipelines are not strong fits.
Job Description:
Overview
Seeking a strong, hands-on Data Engineer to join a fast-moving Cybersecurity organization focused on threat detection, correlation, and automated remediation. This role is heavily data engineering focused (approximately 80% Data Engineering / 20% ML exposure) and requires deep fundamentals, not surface-level experience.
This team works with large-scale, high-volume data pipelines that support near-real-time security analytics and GenAI-driven tools used by Cyber Operations teams and executive leadership.
Key Responsibilities
- Design, build, and maintain scalable data pipelines handling large volumes of structured and semi-structured data
- Develop and optimize pipelines using PySpark and Databricks
- Implement data ingestion, transformation, and automation workflows in AWS
- Work with real-time and near-real-time data sources, including Kafka and APIs
- Design pipelines supporting high-volume processing (beyond simple micro-batching)
- Apply best practices around:
- Data quality
- Performance optimization
- Pipeline reliability and scalability
- Collaborate with cybersecurity, data science, and platform teams to support:
- Threat detection use cases
- Log analysis and security telemetry
- GenAI-powered data products
- Participate in technical and behavioral interviews, including hands-on discussions and screen-sharing exercises
Required Qualifications