Job role: Data Engineer (Truck Industry)
Location: Renton, WA
Onsite role
Experience Level: Mid to Senior
Only local candidates who are US Citizens or Green Card holders will be considered.
Role Overview
The Data Engineer will play a critical role in building scalable, reliable data pipelines that support real-time and batch processing workflows. You will work closely with cross-functional teams to integrate multiple data sources, build Operational Data Store (ODS) transformations, and enable timely data availability for reporting and analytics through dashboards.
Key Responsibilities
Data Ingestion & Integration
Develop and maintain data ingestion pipelines for service and repair data using Confluent Kafka for event streaming.
Implement connectors and integrations between Kafka, AWS S3, Google Dataflow, and Snowflake to facilitate batch and real-time data flows.
Work with APIs and Apigee to securely ingest and distribute data across internal and external systems including dealer networks.
Data Cleansing & Transformation
Build and optimize data cleansing, normalization, and transformation pipelines in Google Dataflow for real-time processing.
Design and implement batch transformation jobs within Snowflake, building and maintaining the Operational Data Store (ODS).
Ensure data quality, consistency, and integrity across all processing stages.
Data Publishing & Reporting Support
Publish transformed and aggregated data to internal and external dashboards using APIs, Kafka topics, and Tableau.
Collaborate with data analysts and business stakeholders to support reporting and analytics requirements.
Monitor and troubleshoot data pipelines to ensure high availability and performance.
Collaboration & Documentation
Partner with data architects, analysts, and external dealer teams to understand data requirements and source systems.
Document data workflows, processing logic, and integration specifications.
Adhere to best practices in data security, governance, and compliance.
Required Technologies & Skills
Event Streaming: Confluent Kafka (proficiency), Kafka Connectors
API Management: Apigee (proficiency)
Cloud Storage & Data Warehousing: AWS S3, Snowflake
Data Processing: Google Dataflow
Programming: SQL, Python (proficiency)
Batch & Real-Time Pipeline Development
Data Visualization Support: Tableau (basic understanding for data publishing)
Experience building Operational Data Stores (ODS) and data transformation pipelines in Snowflake
Familiarity with truck industry aftersales or automotive service and repair data is a plus
Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
3 years of proven experience in data engineering, especially with streaming and batch data pipelines.
Hands-on experience with the Kafka ecosystem (Confluent Kafka, Kafka Connectors) and cloud data platforms (Snowflake, AWS).
Skilled in Python programming for data processing and automation.
Experience with Google Cloud Platform services, especially Google Dataflow, is highly desirable.
Strong understanding of data modeling, ETL/ELT processes, and data quality principles.
Ability to work collaboratively in cross-functional teams and communicate technical concepts effectively.