Data Engineer
Required Skills & Experience
- Proficient in Python with deep experience using pandas or polars
- Strong understanding of ETL development, including data extraction and transformation
- Hands-on experience with SQL and querying large datasets
- Experience deploying workflows on Apache Airflow
- Familiar with web scraping techniques (Selenium is a plus)
- Comfortable working with various data formats and large-scale datasets
- Experience with Azure DevOps including pipeline configuration and automation
- Familiarity with Pytest or equivalent test frameworks
- Strong communication skills and a team-first attitude
- Experience with Databricks
- Familiarity with AWS services
- Working knowledge of Jenkins and advanced Azure DevOps (ADO) pipelines
Key Responsibilities
- Design, build, and maintain pipelines in Python to collect data from a wide range of sources (APIs, SFTP servers, websites, emails, PDFs, etc.)
- Deploy and orchestrate workflows using Apache Airflow
- Perform web scraping using libraries such as requests, BeautifulSoup, and Selenium
- Handle structured, semi-structured, and unstructured data efficiently
- Transform datasets using pandas and/or polars
- Write unit and component tests using pytest
- Collaborate with platform teams to improve the data scraping framework
- Query and analyze data using SQL (PostgreSQL, MSSQL, Databricks)
- Conduct code reviews, support best practices, and improve coding standards across the team
- Manage and maintain CI/CD pipelines (Azure DevOps Pipelines, Jenkins)
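To illustrate the scrape-transform-test loop described in these responsibilities, here is a minimal, self-contained sketch. It uses only the standard library's html.parser as a stand-in for BeautifulSoup, and the HTML snippet, field names, and functions are illustrative assumptions, not part of any real source system:

```python
# Illustrative sketch only: extract rows from an HTML table, then clean them.
# In a real pipeline, BeautifulSoup/Selenium would handle extraction and
# pandas/polars would handle transformation at scale.
from html.parser import HTMLParser


class CellParser(HTMLParser):
    """Collect the text inside <td> cells (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self._in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td and data.strip():
            self.cells.append(data.strip())


def extract_rows(html, width=2):
    """Group flat cell text back into rows of `width` columns."""
    parser = CellParser()
    parser.feed(html)
    cells = parser.cells
    return [tuple(cells[i:i + width]) for i in range(0, len(cells), width)]


def transform(rows):
    """Cast the second column to float -- the kind of cleanup done in pandas."""
    return [(name, float(price)) for name, price in rows]


if __name__ == "__main__":
    html = ("<table><tr><td>widget</td><td>9.99</td></tr>"
            "<tr><td>gadget</td><td>12.50</td></tr></table>")
    print(transform(extract_rows(html)))
```

The `extract_rows` and `transform` functions are deliberately pure so they can be unit-tested with pytest-style assertions, matching the testing responsibility above.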
Tech Stack
Main/essential:
- Python with pandas and/or polars
- SQL
- Azure DevOps
- Airflow