DESCRIPTION
KEY RESPONSIBILITIES
Data Engineering & AI Pipeline Development:
- Design and implement scalable data architectures to process high-volume IoT sensor data and telemetry streams, ensuring reliable data capture and processing for AI/ML workloads
- Build and maintain data pipelines across the AI product lifecycle, including training data preparation, feature engineering, and inference data flows
- Develop and optimize RAG (Retrieval-Augmented Generation) systems, including vector databases, embedding pipelines, and efficient retrieval mechanisms
- Lead the architecture and development of scalable data platforms on Databricks
- Drive the integration of GenAI capabilities into data workflows and applications
- Optimize data processing for performance, cost, and reliability at scale
- Create robust data integration solutions that combine industrial IoT data streams with enterprise data sources for AI model training and inference
DataOps:
- Implement DataOps practices to ensure continuous integration and delivery of data pipelines powering AI solutions
- Design and maintain automated testing frameworks for data quality, data drift detection, and AI model performance monitoring
- Create self-service data assets enabling data scientists and ML engineers to access and utilize data efficiently
- Design and maintain automated documentation for data lineage and AI model provenance
Collaboration & Innovation:
- Partner with ML engineers and data scientists to implement efficient data workflows for model training, fine-tuning, and deployment
- Mentor team members and provide technical leadership on complex data engineering challenges
- Establish data engineering best practices including modular code design and reusable frameworks
- Drive projects to completion while working in an agile environment with evolving requirements in the rapidly changing AI landscape
QUALIFICATIONS
YOU MUST HAVE
- Education: Master's degree in Computer Science, Engineering, Applied Mathematics, or a related STEM field
- 5 years of experience building production data pipelines in Databricks processing TB-scale data
- 3 years of experience implementing the medallion architecture (Bronze/Silver/Gold) with Delta Lake, Delta Live Tables (DLT), and Lakeflow for batch and streaming pipelines from Event Hubs or Kafka sources (a minimal sketch of this pattern follows this list)
- 3 years of experience and hands-on proficiency with PySpark for distributed data processing and transformation
- 3 years of experience working with cloud platforms such as Azure, GCP, and Databricks, especially in designing and implementing AI/ML-driven data workflows
- Proficiency in CI/CD practices using Databricks Asset Bundles (DABs), Git workflows, and GitHub Actions, plus an understanding of DataOps practices including data quality testing and observability
- Hands-on experience building RAG applications with vector databases, LLM integration, and agentic frameworks such as LangChain and LangGraph
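For context on the medallion-architecture requirement above, here is a minimal sketch of a Bronze-to-Silver Delta Live Tables flow in PySpark; the broker address, topic name, and payload handling are placeholder assumptions for illustration, not details from this posting.

```python
# Minimal sketch: Bronze -> Silver medallion flow with Delta Live Tables (DLT).
# Broker address, topic name, and payload columns are hypothetical placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: raw telemetry ingested as-is from a streaming source")
def telemetry_bronze():
    return (
        spark.readStream.format("kafka")                      # `spark` is provided by the DLT runtime
        .option("kafka.bootstrap.servers", "<broker:9092>")   # placeholder broker
        .option("subscribe", "telemetry")                     # placeholder topic
        .load()
    )

@dlt.table(comment="Silver: decoded payloads with a basic quality expectation")
@dlt.expect_or_drop("has_payload", "payload IS NOT NULL")
def telemetry_silver():
    return (
        dlt.read_stream("telemetry_bronze")
        .select(
            F.col("value").cast("string").alias("payload"),   # Kafka values arrive as binary
            F.col("timestamp").alias("ingested_at"),
        )
        # Parsing `payload` into typed sensor columns would happen here.
    )
```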
WE VALUE
- Experience building RAG and agentic architecture solutions and working with LLM-powered applications
- Expertise in real-time data processing frameworks (Apache Spark Streaming, Structured Streaming)
- Knowledge of MLOps practices and experience building data pipelines for AI model deployment
- Experience with time-series databases and IoT data modeling patterns
- Familiarity with containerization (Docker) and orchestration (Kubernetes) for AI workloads
- Strong background in data quality implementation for AI training data
- Experience working with distributed teams and cross-functional collaboration
- Knowledge of data security and governance practices for AI systems
- Experience working on analytics projects with Agile and Scrum methodologies
- Natural analytical mindset with a demonstrated ability to explore data, debug complex distributed systems, and optimize pipeline performance at scale
U.S. PERSONS CONSIDERATIONS:
Due to compliance with U.S. export control laws and regulations, the candidate must be a U.S. Person, which is defined as a U.S. citizen, a U.S. permanent resident, a person with protected status in the U.S. under asylum or refugee status, or a person with the ability to obtain an export authorization.
BENEFITS AT HONEYWELL:
In addition to a competitive salary, leading-edge work, and the opportunity to develop solutions side-by-side with dedicated experts in their fields, Honeywell employees are eligible for a comprehensive benefits package. This package includes employer-subsidized Medical, Dental, Vision, and Life Insurance; Short-Term and Long-Term Disability; a 401(k) match, Flexible Spending Accounts, Health Savings Accounts, EAP, and Educational Assistance; Parental Leave; Paid Time Off (for vacation, personal business, sick time, and parental leave); and 12 Paid Holidays. For more Honeywell Benefits information, visit:
The application period for the job is estimated to be 40 days from the job posting date; however, this may be shortened or extended depending on business needs and the availability of qualified candidates.
Job Posting Date: 11/11/2025
Required Experience:
Senior IC