Role: Sr. Data and GenAI Engineer (Finance Domain).
Location: Dallas, TX (Onsite).
Duration: Long-Term Contract.
Job Summary:
- The Sr. Data and GenAI Engineer will architect knowledge graph and vector database infrastructure optimized for LLM fine-tuning and low-latency inference in financial services. This hands-on role owns GraphDB/Neo4j pipeline design, RAG systems, embedding generation, and data preprocessing for enterprise-grade AI serving financial domain use cases (transactions, risk, compliance).
Key Responsibilities:
Graph & Vector Infrastructure:
- Design GraphDB solutions (Neo4j, Amazon Neptune) modeling complex financial relationships (counterparties, transactions, risk exposures, ownership hierarchies).
- Implement VectorDB infrastructure (Pinecone, Weaviate, pgvector) for semantic search and RAG supporting financial document retrieval.
- Build hybrid GraphRAG pipelines combining structured relationship traversal with unstructured semantic similarity.
Knowledge Graph Pipelines:
- Create multi-source knowledge graph ingestion from transactional systems, market data, KYC/AML feeds, and regulatory documents.
- Implement entity resolution, relationship extraction, and temporal graph modeling for financial lineage and compliance.
- Optimize graph queries for real-time risk analysis and recommendation systems.
LLM Data Preparation:
- Build scalable embedding pipelines using Sentence Transformers, OpenAI embeddings, or financial domain models.
- Implement data preprocessing workflows for LLM fine-tuning (deduplication, chunking, metadata enrichment).
- Orchestrate RAG pipelines with financial context retrieval, prompt engineering, and response synthesis.
Model Deployment & Inference:
- Deploy fine-tuned LLMs on GPU servers (NVIDIA A100/H100) with vLLM, TensorRT-LLM, or TGI for optimized inference.
- Implement model serving infrastructure with auto-scaling, request queuing, and latency monitoring.
- Design multi-tenant isolation and cost governance for production AI workloads.
Data Governance & Compliance:
- Ensure data lineage, auditability, and regulatory compliance (SEC, FINRA, GDPR) across AI pipelines.
- Implement access controls, PII masking, and model explainability for financial governance.
Technical Stack:
Graph Database:
- Neo4j: Cypher queries, APOC, Bloom visualization.
- Amazon Neptune: Gremlin/SPARQL, GraphRAG.
- TigerGraph: GSQL for financial analytics.
Vector Database:
- Pinecone: Serverless, metadata filtering.
- Weaviate: Hybrid search, GraphQL.
- pgvector: PostgreSQL native vectors.
- Milvus: High-throughput financial embeddings.
LLM Infrastructure:
- vLLM/TGI: OpenAI-compatible serving.
- TensorRT-LLM: NVIDIA inference optimization.
- Ray Serve: Multi-model orchestration.
- Kubernetes: GPU auto-scaling.
Data Engineering:
- Apache Airflow: Pipeline orchestration.
- dbt: Transformation, testing.
- Great Expectations: Data quality.
- Monte Carlo: Observability.
Financial Domain Expertise (Required):
- Transaction Graphs: Payment networks, trade settlement.
- Risk Networks: Counterparty exposure, concentration risk.
- KYC/AML: Entity resolution, sanctions screening.
- Compliance: Regulatory relationship mapping.
- Wealth Management: Portfolio holdings, ownership chains.