GCP Data Engineer | Java & PySpark
Job Summary
Synechron is seeking a highly skilled GCP Data Engineer with expertise in Java and PySpark to lead the development of scalable data pipelines and microservices on Google Cloud. In this role, you will collaborate with cross-functional teams to design, implement, and optimize data-driven solutions that support our global marketing and personalization initiatives. Your technical leadership will enable innovative cloud-native architectures that enhance business capabilities and drive operational efficiency.
Software Requirements
Required Software Proficiency:
Java (preferably version 8 or newer): extensive experience in developing backend services using Java and Spring Boot
PySpark: proven experience in large-scale data processing within GCP environments
SQL: hands-on experience with relational databases and query optimization techniques
Google Cloud Platform (GCP): Dataproc, BigQuery, GKE, Cloud Storage; solid working knowledge of deploying and managing data systems on GCP
UNIX/Linux shell scripting, Python, Perl: practical knowledge of scripting for automation and data manipulation
RESTful web services: experience in designing and implementing APIs for data exchange
Version control: Git; strong familiarity with code management and collaboration workflows
Preferred Software Skills:
GCP-native tools: Dataflow, Cloud Composer, Cloud Storage; experience with fully managed cloud data workflows
Hadoop, Hive, or additional big data processing tools for data migration and legacy system integration
Machine learning libraries: TensorFlow, Scikit-learn; knowledge for implementing AI/ML features
Data streaming tools: Kafka; experience with real-time data pipelines and event processing
Overall Responsibilities
Design, develop, and optimize large-scale data pipelines leveraging PySpark, Dataproc, and BigQuery, ensuring robustness and scalability
Build and maintain microservices and APIs using Java (Spring Boot) deployed on GKE (Google Kubernetes Engine)
Collaborate with product teams, data scientists, and stakeholders to analyze data requirements and translate them into technical solutions
Modernize architecture by migrating workloads from Hadoop, Spark, and Hive to GCP cloud infrastructure
Conduct performance tuning, troubleshooting, and system optimization for data pipelines and services
Write detailed technical documentation, including architecture diagrams, APIs, and deployment procedures
Stay informed on emerging trends in cloud data engineering, AI/ML, and big data to apply innovations effectively
Lead code reviews, enforce coding standards, and foster best practices across the team
Technical Skills (By Category)
Programming Languages:
Essential: Java (Spring Boot), PySpark, SQL; proven ability to develop scalable backend and data processing solutions
Preferred: Shell scripting, Perl, Python for automation and scripting tasks
Databases/Data Management:
BigQuery, Dataproc, Hadoop, Hive: experience in designing, managing, and optimizing large datasets and query performance
Cloud Technologies:
GCP (Dataproc, BigQuery, GKE, Cloud Storage): thorough knowledge of cloud deployment, data integration, and management
Frameworks & Libraries:
PySpark and Dataflow for data processing; REST API development with Spring Boot
Development Tools & Methods:
Git, CI/CD pipelines (Jenkins, Cloud Build), Agile/Scrum practices for collaboration and continuous deployment
Security & Data Governance:
Understanding of data security, privacy standards, and best practices for cloud-based data platforms
Experience Requirements
4 years of professional experience in data engineering, software development, or related roles, preferably on cloud platforms
Demonstrated experience developing and deploying scalable data pipelines and microservices in a GCP environment
Proven expertise in Java and PySpark for large-scale data processing and system integration
Familiarity with cloud migration projects, big data frameworks, and AI/ML libraries is a plus
Effective in collaborative Agile team environments, with a focus on delivering results
Day-to-Day Activities
Develop, test, and refine data pipelines using PySpark, Dataproc, and BigQuery
Design and implement scalable back-end microservices and APIs with Java Spring Boot deployed on GKE
Participate in Agile ceremonies, including sprint planning, stand-ups, and reviews
Monitor system performance, troubleshoot issues, and optimize data workflows for efficiency and reliability
Collaborate with data scientists, product managers, and analysts to deliver data-driven features
Document system architecture, data models, and deployment instructions to ensure maintainability
Continuously explore new cloud, big data, and AI/ML tools to improve infrastructure and solutions
Qualifications
Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field
4 years of experience in data engineering, big data processing, or software development, with strong cloud expertise
Certifications such as GCP Professional Data Engineer are advantageous
Proven ability to design scalable solutions and lead data migration projects on cloud platforms
Professional Competencies
Strong analytical and problem-solving skills for designing optimized data pipelines
Leadership capabilities to manage project goals and mentor junior team members
Excellent communication skills for translating technical solutions to stakeholders
Adaptability and eagerness to explore cutting-edge technologies and methodologies
Results-driven, with a focus on delivering reliable, scalable, and innovative data solutions
Effective time management to prioritize workload and meet project deadlines
SYNECHRON'S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative, "Same Difference", is committed to fostering an inclusive culture that promotes equality, diversity, and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, more successful businesses as a global company. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.
Required Experience:
IC
About Company
At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver leading digital solutions. Progressive technologies and strategy ...