Design and develop scalable data pipelines using Scala and Apache Spark
Build and optimize data processing frameworks on GCP (Dataproc, Dataflow, BigQuery)
Develop high-performance batch and real-time data solutions
Work with large-scale structured and semi-structured datasets
Optimize BigQuery performance, partitioning, and cost efficiency
Implement data ingestion frameworks from multiple sources (APIs, streaming, databases)
Collaborate with cross-functional teams, including Data Scientists, Analysts, and Architects
Ensure data quality, governance, and security standards
Participate in code reviews, performance tuning, and troubleshooting
Support CI/CD implementation and deployment pipelines