Senior Data Engineer (Python Spark Azure)

Job Location: Plano, TX - USA

Monthly Salary: Not Disclosed
Posted on: 3 hours ago
Vacancies: 1

Job Summary

Responsibilities
Data Ingestion & Processing Architecture:
  • Multiple high-velocity data sources, including:
      • Event Hub from Postgres
      • Data pushed to Snowflake
      • Additional events across various internal systems
  • Heavy use of:
      • Kubernetes for scalable processing
      • Databricks / Spark for distributed computing
      • Snowflake for storage and downstream analytics
Must-Have Skills
Backend Engineering:
  • Python (primary)
  • FastAPI (service/API development)
  • Full-stack engineering without UI (backend data infrastructure only)
Cloud & Containerization:
  • Azure (core cloud environment)
  • Docker
  • Kubernetes / AKS
  • Experience running high-scale workloads in K8s
Data Engineering & Distributed Computing:
  • Spark / Databricks
  • Experience handling:
      • Very large datasets
      • Complex pipelines
      • High message volume
      • Mixed batch and streaming data flows
  • Designing & maintaining table schemas
  • Working with Snowflake
Database & Data Handling:
  • Strong SQL data manipulation skills
  • Experience integrating multiple data sources
  • Comfort navigating event streaming ecosystems
Hands-On LLM Integration:
  • Experience integrating LLMs into applications
  • Prompt design/optimization
  • RAG pipelines
  • Vector databases & embedding models
  • Model orchestration patterns
  • Security & compliance for AI systems
Model Monitoring & Optimization:
  • Prompt evaluation frameworks
  • Managing cost/performance tradeoffs

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala