- Design, develop, and maintain end-to-end big data pipelines that are optimized, scalable, and capable of processing large volumes of data in real time and in batch mode
- Collaborate closely with cross-functional stakeholders to gather requirements and deliver high-quality data solutions that align with business goals
- Implement data transformation and integration processes using modern big data frameworks and cloud platforms
- Build and maintain data models, data warehouses, and schema designs to support analytics and reporting needs
- Ensure data quality, reliability, and performance by implementing robust testing, monitoring, and alerting practices
- Contribute to architectural decisions for distributed data systems and help optimize performance for high-load environments
- Ensure compliance with data security and governance standards
Qualifications:
- 4 years of experience in data engineering, big data architecture, or related fields
- Strong proficiency in Python and PySpark
- Advanced knowledge of SQL (query optimization, complex joins, window functions) and experience with NoSQL databases
- Solid understanding of distributed computing principles and hands-on experience with technologies such as Apache Spark, Kafka, Hadoop, Presto, or Databricks
- Experience designing and managing data warehouse and data lake architectures in cloud environments (AWS, GCP, or Azure)
- Familiarity with data modeling, schema design, and performance tuning for large datasets
- Strong understanding of DevOps practices for automating the deployment, monitoring, and scaling of big data applications (e.g., CI/CD pipelines)
- At least an Upper-Intermediate level of English
Additional Information:
PERSONAL PROFILE
- Excellent communication skills and the ability to work effectively in cross-cultural and cross-functional teams
Remote Work:
Yes
Employment Type:
Full-time