Job Title: Python Data Engineer
Location: Dallas, TX
Job Type: Contract
Job Description:
We are seeking a skilled and forward-thinking Python Data Engineer in Dallas, TX. This role will focus on designing and optimizing scalable data infrastructure to support advanced machine learning models, including Generative AI solutions. The ideal candidate will bring strong proficiency in Python 3.11, experience working with modern Azure data services, and a collaborative mindset for working alongside data scientists and ML engineers.
Key Responsibilities:
- Design, develop, and optimize robust data pipelines and ETL workflows for processing large-scale structured and unstructured datasets.
- Work closely with Data Scientists and ML Engineers to support model development, training, inference, and fine-tuning of Generative AI models (e.g., LLMs).
- Build and maintain feature stores, vector databases, and embedding pipelines to support retrieval-augmented generation (RAG) and NLP applications.
- Write clean, efficient, and idiomatic Python 3.11 code for data processing, orchestration, and integration.
- Leverage Azure Machine Learning to deploy, monitor, and manage machine learning models in production.
- Implement secure and efficient data workflows using Azure Data Factory and integrate them with other services such as Data Lake, Synapse Analytics, and Azure OpenAI.
- Enforce data quality, integrity, security, and governance best practices across systems.
- Automate data validation, monitoring, and logging to ensure reliability and scalability of pipelines.
Qualifications:
Required Qualifications:
- 10 years of overall experience.
- Strong proficiency in Python 3.11 with a deep understanding of idiomatic practices and asynchronous programming.
- Proven experience as a Data Engineer working with large-scale data processing systems.
- Solid knowledge of data structures, algorithms, and ETL frameworks.
- Hands-on experience with Azure Data Factory, Azure Data Lake, Synapse Analytics, and Azure Machine Learning.
- Familiarity with Generative AI concepts such as prompt engineering, vector similarity search, LLM deployment, and tokenization strategies.
- Experience supporting data science workflows, including feature engineering, model inference pipelines, and A/B testing.
- Understanding of the ML lifecycle from experimentation to production, including version control, experiment tracking, and model monitoring.
- Working knowledge of data governance and compliance in regulated environments.
Preferred (Bonus) Skills:
- Experience with LLMs (e.g., OpenAI, Hugging Face Transformers) and embedding techniques.
- Familiarity with MLOps tools (e.g., MLflow, DVC, Airflow).
- Exposure to Azure OpenAI Service or similar foundation model deployment environments.
- Proficiency in working with vector databases such as Pinecone, FAISS, or Weaviate.
- Knowledge of CI/CD for data pipelines and infrastructure-as-code tools like Terraform.
Additional Information:
All your information will be kept confidential according to EEO guidelines.
Remote Work:
No
Employment Type:
Contract