GDI Document Intelligence & NLP Specialist

ATS+Partners


Job Location:

Boston, MA - USA

Salary: Not Disclosed
Experience Required: 5 years
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

ATSPartners is seeking a GDI / Document Intelligence & NLP Specialist who brings verifiable, multi-year experience implementing Intelligent Document Processing (IDP) and Generative Document Intelligence (GDI) solutions, ideally for government or regulated-industry clients, with documented extraction accuracy benchmarks from prior deployments. This role exists specifically to address the two most consequential gaps in the current proposal:


Minimum Criterion 1 requires five or more years of experience implementing AI Automation and GDI solutions, preferably for municipal or government clients. The contractor's own professional history must provide the concrete, dated implementation record that satisfies this threshold.

Minimum Criterion 2 requires a proven ability to achieve 80% data extraction accuracy on unstructured documents. The contractor must be able to supply documented benchmark results from prior engagements that an evaluator can verify, not theoretical capability claims.


The contractor will serve as the project's extraction engine expert, responsible for the design, configuration, and continuous improvement of all AI document extraction pipelines. They will own prompt engineering for all four target workflows, configure the model ensemble's consensus logic, build the field-level confidence scoring framework, and manage the human-in-the-loop (HITL) feedback loop that drives continuous accuracy improvement throughout the three-year contract term.


PRIMARY RESPONSIBILITIES

Design and configure AI extraction pipelines for all four target workflows: Invoice Processing, Vehicle Insurance Certificate Reconciliation, Health Insurance Reconciliation, and Audit Preparation.

Develop and maintain custom prompt templates for GPT-4o and Claude AI, including field extraction schemas in JSON format, field-level instructions with positive and negative examples, chain-of-thought validation prompts, and conditional extraction rules.

Configure the LangGraph consensus orchestration layer: define how discrepancies between the two model outputs are resolved, what confidence thresholds trigger automatic pass-through vs. HITL routing, and how reviewer corrections are captured in graph state for retraining.

Build and manage the document classification pipeline: design the taxonomy for audit document categorization, train or configure the classifier against City document samples, and validate classification accuracy.

Establish and execute the accuracy benchmarking protocol: define hold-out test sets for each workflow, then run pre-pilot benchmarks, pilot accuracy benchmarks, and quarterly production accuracy audits.

Lead prompt regression testing: maintain a library of annotated sample documents that are re-run against every prompt update and model version change to detect accuracy regressions before they reach production.

Manage the continuous learning loop: export reviewer corrections from the HITL queue, curate training data, and coordinate model fine-tuning or prompt updates on a quarterly basis.

Configure pre-processing pipeline parameters: image quality thresholds, deskewing settings, multi-page document segmentation rules, and PDF/A normalization parameters for each document type.

Provide domain-specific extraction expertise for ACORD insurance certificates, government invoices, AP aging schedules, GL trial balances, and health insurance enrollment forms.

Document all extraction schemas, prompt versions, confidence threshold configurations, and model accuracy results in a version-controlled Model Registry maintained throughout the contract.
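
For candidates who want a concrete picture of the consensus and confidence-routing responsibility listed above, the following is a minimal Python sketch. It is illustrative only: the `FieldResult` type, the `route_field` helper, and the 0.90 threshold are assumptions made for this example, not project specifications, and the sketch uses plain Python rather than LangGraph.

```python
from dataclasses import dataclass

# Hypothetical threshold; real values would be tuned per field and workflow.
AUTO_PASS_THRESHOLD = 0.90


@dataclass
class FieldResult:
    value: str
    confidence: float  # score in [0, 1] attached to one model's output


def route_field(a: FieldResult, b: FieldResult,
                threshold: float = AUTO_PASS_THRESHOLD) -> dict:
    """Compare two model outputs for one field and decide the route."""
    # Case- and whitespace-insensitive agreement check.
    agree = a.value.strip().lower() == b.value.strip().lower()
    # Use the weaker score so one confident model cannot carry a field
    # that the other model is unsure about.
    confidence = min(a.confidence, b.confidence)
    if agree and confidence >= threshold:
        return {"value": a.value, "confidence": confidence, "route": "auto"}
    # Disagreement or low confidence: send both candidates to human review.
    return {"value": None, "confidence": confidence, "route": "hitl",
            "candidates": [a.value, b.value]}


# Both models agree with high confidence, so the field passes through.
print(route_field(FieldResult("INV-1042", 0.97), FieldResult("inv-1042", 0.93)))
# The models disagree on the amount, so the field goes to the HITL queue.
print(route_field(FieldResult("100.00", 0.99), FieldResult("1000.00", 0.99)))
```

Taking the minimum of the two scores is one conservative design choice among several; model-specific confidence weighting would replace that line.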




Requirements

AI/GDI implementation experience: 5 or more years implementing Intelligent Document Processing or Generative Document Intelligence solutions in production environments

Confidence scoring and HITL design: experience building tiered confidence routing logic connecting AI extraction outputs to human review queues

Documented extraction accuracy benchmarks: must be able to provide benchmark results (with methodology) from at least two prior IDP deployments showing 80% field-level extraction accuracy on unstructured documents

Government or regulated-industry IDP: at least two prior GDI/IDP implementations for government agencies, financial institutions, insurance carriers, or healthcare organizations

Azure AI Document Intelligence (formerly Form Recognizer): 2 years of hands-on configuration of pre-built and custom models for invoice, form, and certificate extraction

Pre-processing pipeline: hands-on experience with PDF normalization, image quality scoring, deskewing, and format conversion as part of a production IDP pipeline

LLM prompt engineering: 2 years designing structured extraction prompts for GPT-4o, Claude, or equivalent models, including JSON schema output, chain-of-thought prompting, and regression testing

RAG integration: experience connecting LLM extraction pipelines to vector search (Azure AI Search, Pinecone, or equivalent) for cross-reference validation

Document classification: experience training or configuring document classifiers for multi-category taxonomies in financial or government document environments

Model accuracy reporting: ability to produce evaluator-grade benchmark reports with field-level precision, recall, and F1 metrics
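
As a rough illustration of the field-level precision, recall, and F1 reporting named in the requirements above, the Python sketch below scores extracted fields against annotated ground truth. The `field_metrics` helper, the sample field names, and the scoring convention (a wrong value counts as both a false positive and a false negative) are assumptions made for this example; a real benchmark methodology would state its own conventions.

```python
from collections import defaultdict


def field_metrics(gold_docs, pred_docs):
    """Return {field: (precision, recall, f1)} over paired documents.

    Assumed convention: a wrong value counts as both a false positive
    and a false negative; a missing key or None means "no value".
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for gold, pred in zip(gold_docs, pred_docs):
        for field in set(gold) | set(pred):
            g, p = gold.get(field), pred.get(field)
            c = counts[field]
            if p is not None and p == g:
                c["tp"] += 1
            else:
                if p is not None:
                    c["fp"] += 1  # extracted something, but it was wrong
                if g is not None:
                    c["fn"] += 1  # a true value was missed or wrong
    report = {}
    for field, c in counts.items():
        predicted = c["tp"] + c["fp"]
        actual = c["tp"] + c["fn"]
        precision = c["tp"] / predicted if predicted else 0.0
        recall = c["tp"] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[field] = (precision, recall, f1)
    return report


# Made-up sample data: two annotated documents and two model outputs.
gold = [{"total": "100.00", "vendor": "Acme"},
        {"total": "250.00", "vendor": "Birch"}]
pred = [{"total": "100.00", "vendor": "Acme"},
        {"total": "205.00"}]
for field, scores in sorted(field_metrics(gold, pred).items()):
    print(field, scores)
```

In this made-up sample, the `total` field scores 0.5 on all three metrics (one correct value, one wrong), while `vendor` has a precision of 1.0 and a recall of 0.5 (one correct value, one missed).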

PREFERRED QUALIFICATIONS

Massachusetts municipal or state government IDP experience: prior deployment of a document intelligence solution for a Massachusetts city, town, or state agency (highest priority preference)

LangGraph or LangChain: hands-on experience building stateful multi-agent extraction workflows with checkpointing and HITL interrupt/resume patterns

Insurance document expertise: specific experience extracting from ACORD certificates of insurance, health insurance enrollment files, or explanation-of-benefits documents

Ensemble architecture: experience building multi-model consensus extraction systems with model-specific confidence weighting

Invoice and AP document expertise: experience with multi-vendor invoice extraction, including edge cases (handwritten amounts, multi-page invoices, non-standard layouts)

Audit document expertise: experience with trial balance extraction, GL account mapping, and financial work paper generation from source documents

HIPAA-compliant AI pipelines: prior implementation of GDI solutions in HIPAA-governed environments, including PHI tokenization and BAA-covered model usage

Fine-tuning and continuous learning: experience running RLHF-style feedback loops or few-shot fine-tuning on domain-specific document corpora

ABBYY Vantage: experience configuring ABBYY for degraded scan processing alongside the primary Azure AI Document Intelligence pipeline

Python: proficiency for custom extraction logic, pre-processing scripts, and benchmarking tooling




Benefits

Company Benefits for Contractors

Flexible project-based consulting opportunities with the ability to work across diverse public, private, and nonprofit sector IT initiatives.

Exposure to high-impact digital transformation, infrastructure modernization, cybersecurity, cloud, and AI-enabled projects that strengthen your portfolio and technical expertise.

Collaborative partnership model that values consultant input, innovation, and professional autonomy while working alongside experienced client and project leadership teams.

Competitive contractor compensation with opportunities for repeat engagements, long-term client relationships, and expansion into strategic advisory or technical leadership roles.




Required Skills:

UiPath or Power Automate: 3 years building and deploying production RPA workflows, including unattended automation

Web automation: proficiency with Selenium, UiPath Web Automation, or equivalent for browser-based workflow automation

Insurance portal automation: documented experience automating web interactions with insurance carrier portals, including certificate of insurance retrieval or verification

Credential and secrets management: experience using Azure Key Vault, HashiCorp Vault, or equivalent for secure credential storage in RPA workflows

ACORD forms or certificate of insurance processing: experience with variable-format insurance document handling

Azure cloud: working knowledge of Azure-hosted RPA infrastructure, Azure Active Directory, and service principal authentication

Vehicle insurance or fleet insurance data management: experience with fleet-level coverage validation, VIN-based tracking, or commercial auto insurance reconciliation

API integration: ability to combine RPA with available APIs where they exist, using hybrid API-plus-RPA patterns

Intelligent RPA (iRPA): experience with semantic/AI-assisted element targeting for resilience against portal UI changes

Regulated environment experience: prior work in government, insurance, or healthcare environments with compliance logging requirements
