Project – the aim you'll have
Our customer provides innovative solutions and insights that enable its clients to manage risk and hire the best talent. Its advanced global technology platform supports fully scalable, configurable screening programs that meet the unique needs of over 33,000 clients worldwide. Headquartered in Atlanta, GA, the company has an internationally distributed workforce of about 5,500 employees spanning 19 countries. Our partner performs over 93 million screens annually in over 200 countries and territories.
We are seeking a Senior Data Engineer with solid Python/PySpark programming skills to join the Data Engineering Team and help us build the Data Analytics Platform in the Azure cloud.
Position – how you'll contribute
- Develop reusable metadata-driven data pipelines
- Automate and optimize data-platform-related processes
- Build integrations with data sources and data consumers
- Add data transformation methods to shared ETL libraries
- Write unit tests
- Develop monitoring solutions for the Databricks data platform
- Proactively resolve any performance or quality issues in ETL processes
- Cooperate with the infrastructure engineering team to set up cloud resources
- Contribute to data platform wiki / documentation
- Perform code reviews and ensure code quality
- Initiate and implement improvements to the data platform architecture
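The "metadata-driven pipelines" responsibility above can be sketched in plain Python: pipeline steps are described by configuration, so new datasets are onboarded by adding metadata rather than code. This is a minimal illustration only; the transform names, dataset names, and `run_pipeline` helper are invented here, and a real implementation would run the transforms as PySpark jobs on Databricks.

```python
# Minimal sketch of a metadata-driven pipeline: each dataset's processing
# is described by configuration, and transforms come from a shared registry
# (which would live in a reusable ETL library). All names are illustrative.
from typing import Callable

# Registry of reusable transformations keyed by name.
TRANSFORMS: dict[str, Callable[[list[dict]], list[dict]]] = {
    "drop_nulls": lambda rows: [
        r for r in rows if all(v is not None for v in r.values())
    ],
    "uppercase_country": lambda rows: [
        {**r, "country": r["country"].upper()} for r in rows
    ],
}

# Pipeline metadata: which transforms to apply, in order, per dataset.
PIPELINE_CONFIG = {
    "screenings": ["drop_nulls", "uppercase_country"],
}

def run_pipeline(dataset: str, rows: list[dict]) -> list[dict]:
    """Apply the configured transform chain for a dataset."""
    for name in PIPELINE_CONFIG[dataset]:
        rows = TRANSFORMS[name](rows)
    return rows

rows = [
    {"id": 1, "country": "pl"},
    {"id": 2, "country": None},  # removed by drop_nulls
]
print(run_pipeline("screenings", rows))  # [{'id': 1, 'country': 'PL'}]
```

The same shape makes unit testing straightforward: each registered transform is a pure function that can be tested in isolation.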
Qualifications:
Expectations – the experience you need
- Programming: Python/PySpark, SQL
- Proficient in building robust data pipelines using Databricks Spark
- Experienced in dealing with large and complex datasets
- Knowledgeable about building data transformation modules organized as libraries (Python packages)
- Familiar with Databricks Delta optimization techniques (partitioning, Z-ordering, compaction, etc.)
- Experienced in developing CI/CD pipelines
- Experienced in leveraging event brokers (Kafka / Event Hubs / Kinesis) to integrate with data sources and data consumers
- Understanding of basic networking concepts
- Familiar with Agile Software Development methodologies (Scrum)
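The Delta optimization techniques listed above (partitioning, Z-ordering, compaction) are typically applied via Spark SQL maintenance statements such as `OPTIMIZE ... ZORDER BY` and `VACUUM`. As a hedged sketch, the helper below generates those statements from table metadata; the table and column names are invented, and on a real cluster the strings would be passed to `spark.sql(...)`.

```python
# Sketch: build Delta Lake maintenance statements from table metadata.
# OPTIMIZE compacts small files; ZORDER BY co-locates data for the listed
# columns to speed up selective reads; VACUUM removes files older than the
# retention window. Table and column names below are illustrative.

def maintenance_statements(table: str,
                           zorder_cols: list[str],
                           retain_hours: int = 168) -> list[str]:
    """Return the OPTIMIZE and VACUUM statements for one Delta table."""
    optimize = f"OPTIMIZE {table}"
    if zorder_cols:
        optimize += f" ZORDER BY ({', '.join(zorder_cols)})"
    vacuum = f"VACUUM {table} RETAIN {retain_hours} HOURS"
    return [optimize, vacuum]

for stmt in maintenance_statements("analytics.screenings",
                                   ["client_id", "screen_date"]):
    print(stmt)
# OPTIMIZE analytics.screenings ZORDER BY (client_id, screen_date)
# VACUUM analytics.screenings RETAIN 168 HOURS
```

Driving such maintenance from metadata fits the posting's metadata-driven theme: per-table Z-order columns and retention windows become configuration rather than hand-run commands.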
Additional skills – the edge you have
- Understanding of stream processing challenges and familiarity with Spark Structured Streaming
- Experience with IaC (Terraform, Bicep, or other)
- Experience running containerized applications (Azure Container Apps, Kubernetes)
- Experience building event sourcing solutions
- Familiarity with platforms for change data capture (e.g. Debezium)
- Knowledge of Azure cloud-native solutions (e.g. Azure Data Factory, Azure Functions, Azure Container Instances)
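Event sourcing, mentioned among the nice-to-haves above, means deriving current state by replaying an append-only log of immutable events rather than updating records in place. A minimal, framework-free sketch (the event types and field names are invented for illustration):

```python
# Sketch of event sourcing: state is never mutated directly; instead we
# fold over an ordered log of immutable events to reconstruct it.
# Event types and fields are illustrative.

def apply_event(state: dict, event: dict) -> dict:
    """Advance the state by one event."""
    if event["type"] == "ScreenRequested":
        state[event["screen_id"]] = "pending"
    elif event["type"] == "ScreenCompleted":
        state[event["screen_id"]] = "done"
    return state

def replay(events: list[dict]) -> dict:
    """Rebuild current state from the full event log."""
    state: dict = {}
    for e in events:
        state = apply_event(state, e)
    return state

log = [
    {"type": "ScreenRequested", "screen_id": "s1"},
    {"type": "ScreenRequested", "screen_id": "s2"},
    {"type": "ScreenCompleted", "screen_id": "s1"},
]
print(replay(log))  # {'s1': 'done', 's2': 'pending'}
```

In production the log would typically live in a broker such as Kafka or Event Hubs (also listed above), with consumers replaying or tailing it to build read models.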
Additional Information:
Our offer – professional development, personal growth:
- Flexible employment and remote work
- International projects with leading global clients
- International business trips
- Non-corporate atmosphere
- Language classes
- Internal & external training
- Private healthcare and insurance
- Multisport card
- Well-being initiatives
Position at: Software Mind Poland
This role requires candidates to be based in Poland.
Remote Work: Yes
Employment Type: Full-time