Data Engineer AI
Marshall County, WV - USA
Job Summary
Data Engineer - AI
Dallas, TX (preferred) | Hybrid (Bishop Arts preferred) | Full-time
Reports to the Founding AI / Engineering Team
Why this role exists
Our client is an AI-powered contract intelligence platform that validates purchased services invoices against contract terms before payment, turning contracts into enforceable controls within healthcare procure-to-pay workflows.
The platform processes massive volumes of contracts, invoices, vendor records, and transactional data. A single enterprise customer may generate over 30,000 invoice and contract-related documents monthly, all requiring ingestion, extraction, normalization, validation, monitoring, and analytics.
The company's founding engineering team is currently focused on building higher-level AI systems: semantic layers, ontology frameworks, and enterprise-scale platform architecture. This role exists to own the implementation and operationalization layer underneath that vision, building and maintaining the pipelines, reporting systems, integrations, and scalable data infrastructure that allow the platform to operate reliably at enterprise scale.
This is not a pure analytics role and not a pure research role. It is a hands-on engineering role for someone who can build production-grade data pipelines while also understanding how modern AI, ML, LLM, and knowledge graph systems operate.
If you enjoy building scalable data systems, handling messy enterprise data, operationalizing AI pipelines, and creating infrastructure that powers enterprise SaaS products, this role will feel like a strong fit.
What you'll own
Enterprise Data Pipeline Engineering
- Build, maintain, and optimize large-scale ETL/ELT pipelines for contracts, invoices, logs, traces, events, and operational data
- Support enterprise-scale ingestion and processing workflows for healthcare procurement and AP data
- Design resilient streaming and batch processing systems
- Help operationalize the platform for enterprise-grade customer workloads
- Improve pipeline reliability, observability, scalability, and monitoring
- Support distributed data processing workflows across large document and transactional datasets
Reporting & Operational Analytics
- Build internal and customer-facing reporting systems showing document processing status, validation outcomes, exceptions, and operational insights
- Create dashboards and analytics layers that provide actionable insights from invoice and contract data
- Develop ad hoc reporting capabilities for founders, GTM teams, customers, and investors
- Help identify trends, gaps, anomalies, and operational patterns across purchased services spend
- Translate raw platform data into usable operational intelligence
AI/ML Data Infrastructure
- Support AI and ML pipelines powering contract intelligence and invoice validation workflows
- Build infrastructure supporting LLM, ML, and semantic data workflows
- Work alongside engineers building ontology layers, semantic layers, and knowledge graph systems
- Help structure and operationalize datasets for AI-driven applications
- Support vector databases, semantic retrieval, and modern AI architecture workflows
- Understand how data flows through MLOps and LLMOps environments
Platform Data Foundations
- Help maintain and improve the company's core data architecture
- Support enterprise-grade logging, tracing, monitoring, and event management systems
- Build scalable data lake and storage workflows
- Improve system reliability and operational visibility as customer scale increases
- Collaborate closely with AI engineers and platform leadership on implementation and execution
What Success Looks Like (First 90 Days)
First 45 Days
- Ramp quickly on the AI platform, pipeline architecture, and customer workflows
- Understand how contracts, invoices, validation systems, and analytics pipelines interact
- Identify gaps in pipeline reliability, reporting, and data quality
- Begin contributing production-ready improvements to core pipelines and operational systems
By 90 Days
- Core reporting and analytics workflows are operational and scalable
- Enterprise pipeline reliability and monitoring improve measurably
- Data quality and processing visibility improve across customer workflows
- Internal teams can access cleaner operational reporting and analytics
- Founders and customer-facing teams can generate custom reporting more efficiently
- AI and semantic systems receive more reliable and structured downstream data
- You are independently building and maintaining production data workflows with minimal oversight
The profile that tends to win here
- You are first and foremost a strong engineer who can build and maintain production systems
- You have experience working with enterprise-scale or mid-market data environments, not only early-stage startups
- You've worked with large-scale transactional, operational, or machine-generated datasets
- You understand modern AI/ML ecosystems well enough to support them operationally
- You are comfortable dealing with ambiguity and evolving infrastructure
- You think systematically about scalability, reliability, and maintainability
- You can move comfortably between infrastructure, pipelines, analytics, and operational engineering
- You are highly analytical and naturally curious about patterns, anomalies, and data quality
- You move quickly, fail fast, and care deeply about accuracy and operational quality
Qualifications
- 4-8 years of experience in Data Engineering, Platform Engineering, or Backend/Data Infrastructure roles
- Strong experience building ETL/ELT pipelines in production environments
- Experience with distributed data processing systems
- Experience handling streaming and batch data workflows
- Strong SQL and Python skills
- Experience with modern cloud infrastructure (AWS, GCP, or Azure)
- Experience working with data lakes and large-scale operational datasets
- Experience handling logs, traces, events, and telemetry-style data
- Familiarity with ML pipelines, vector databases, or modern AI data architectures
- Understanding of MLOps and/or LLMOps concepts
- Experience building reporting systems, dashboards, and operational analytics workflows
- Comfortable working in fast-moving startup environments with evolving requirements
Strongly Preferred:
- Experience supporting AI/LLM-driven products
- Exposure to knowledge graphs, semantic layers, or ontology systems
- Experience in enterprise SaaS environments
- Experience with observability and monitoring tooling
- Familiarity with healthcare procurement, AP automation, or invoice processing systems
- Experience building customer-facing analytics systems
- Experience supporting high-volume document processing systems
Compensation & Benefits
- Competitive base salary, variable comp, and potential for future equity
- Opportunity to help build foundational infrastructure at an early-stage AI company
- High ownership and direct technical impact
- Flexible and remote-friendly environment
- Opportunity to work on cutting-edge AI enterprise data infrastructure problems