Data Engineer with GCP, Databricks & PySpark Expertise

Job Location: Pune, India

Monthly Salary: Not Disclosed

Vacancy: 1

Job Description

Job Summary

Synechron is seeking a skilled Data Engineer experienced in Google Cloud Platform (GCP), Databricks, and PySpark. In this role, you will design, develop, and maintain scalable data pipelines and workflows to enable advanced analytics and business intelligence solutions. You will work within a collaborative environment to integrate diverse data sources, optimize data processing workflows, and ensure data quality and availability. Your contributions will support strategic decision-making and enhance the organization's data-driven initiatives.

Software Requirements

Required Skills:

  • Hands-on experience with GCP services, specifically BigQuery, Cloud Storage, and Composer, for data pipeline orchestration

  • Proficiency in the Databricks platform with PySpark for building and optimizing large-scale ETL/ELT processes (see the sketch at the end of this section)

  • Expertise in writing and tuning complex SQL queries for data transformation, aggregation, and reporting on large datasets

  • Experience integrating data from multiple sources, such as APIs, cloud storage, and databases, into a central data warehouse

  • Familiarity with workflow orchestration tools like Apache Airflow or Cloud Composer for scheduling, monitoring, and managing data jobs

  • Knowledge of version control systems (Git), CI/CD practices, and Agile development methodologies

Preferred Skills:

  • Experience with other cloud platforms (AWS, Azure) or additional GCP services (Dataflow, Pub/Sub)

  • Knowledge of data modeling and data governance best practices

  • Familiarity with containerization and orchestration tools such as Docker or Kubernetes
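
As a concrete illustration of the kind of pipeline work described above, here is a minimal PySpark sketch of a Databricks job that reads raw files from Cloud Storage, applies a simple transformation, and loads the result into BigQuery. It is a sketch only: the bucket, project, dataset, and table names are hypothetical, and it assumes the spark-bigquery connector is available on the cluster.

```python
# Minimal Databricks/PySpark ETL sketch: Cloud Storage -> transform -> BigQuery.
# All paths, project, and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-orders-load").getOrCreate()

# Read raw JSON files landed in a Cloud Storage bucket (hypothetical path).
raw = spark.read.json("gs://example-raw-bucket/orders/2024-01-01/")

# Basic cleansing and aggregation: de-duplicate, then roll up to daily totals.
daily = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("created_at"))
       .groupBy("order_date", "region")
       .agg(
           F.sum("amount").alias("total_amount"),
           F.countDistinct("order_id").alias("order_count"),
       )
)

# Load into BigQuery via the spark-bigquery connector (assumed to be installed,
# e.g. as a Databricks cluster library).
(
    daily.write.format("bigquery")
         .option("table", "example-project.analytics.daily_orders")
         .option("temporaryGcsBucket", "example-temp-bucket")
         .mode("overwrite")
         .save()
)
```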

Overall Responsibilities

  • Design, develop, and maintain scalable data pipelines using GCP, Databricks, and associated tools

  • Write efficient, well-documented SQL queries to support data transformation, data quality, and reporting needs

  • Integrate data from diverse sources, including APIs, cloud storage, and databases, to create a reliable central data repository

  • Develop automated workflows and schedules for data processing tasks using Composer or Airflow (a minimal DAG sketch follows this list)

  • Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements and deliver solutions

  • Monitor, troubleshoot, and optimize data pipelines for performance, scalability, and reliability

  • Maintain data security and privacy standards and documentation compliance

  • Stay informed about emerging data engineering technologies and apply them effectively to improve workflows
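
For the scheduling side of these responsibilities, a minimal Cloud Composer / Airflow DAG might look like the sketch below. The DAG id, schedule, and task body are hypothetical; in practice the task might trigger a Databricks job rather than run Python inline.

```python
# Minimal Airflow DAG sketch for Cloud Composer (Airflow 2.x).
# The DAG id, schedule, and task body are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_daily_load():
    # Placeholder for the actual load step; in practice this might trigger a
    # Databricks job or submit the PySpark pipeline sketched earlier.
    print("running daily load")


with DAG(
    dag_id="daily_orders_pipeline",       # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",        # run daily at 02:00 UTC
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="daily_load", python_callable=run_daily_load)
```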

Technical Skills (By Category)

  • Programming Languages:

    • Required: PySpark (Python in Databricks) and SQL

    • Preferred: Python, Java, or Scala for custom data processing

  • Databases/Data Management:

    • Required: BigQuery and relational databases; large-scale data transformation and querying (see the example query at the end of this section)

    • Preferred: Data cataloging and governance tools

  • Cloud Technologies:

    • Required: GCP services, including BigQuery, Cloud Storage, and Composer

    • Preferred: Experience with other cloud services (AWS, Azure)

  • Frameworks and Libraries:

    • Required: Databricks with PySpark; Airflow or Cloud Composer

    • Preferred: Data processing frameworks such as Apache Beam / Dataflow

  • Development Tools and Methodologies:

    • Version control using Git

    • CI/CD pipelines for automated deployment and testing

    • Agile development practices

  • Security & Compliance:

    • Knowledge of data security best practices, access controls, and data privacy regulations
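
To make the SQL expectations concrete, the snippet below runs a typical transformation/reporting query (a rolling window aggregate) on BigQuery through the official google-cloud-bigquery Python client. The table and column names are invented and reuse the hypothetical daily_orders table from the earlier sketch.

```python
# Running an analytical SQL query on BigQuery from Python using the official
# google-cloud-bigquery client. Table and column names are invented examples.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# A typical reporting query: 7-day rolling revenue per region via a window function.
query = """
SELECT
  region,
  order_date,
  SUM(total_amount) OVER (
    PARTITION BY region
    ORDER BY order_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS revenue_7d
FROM `example-project.analytics.daily_orders`
ORDER BY region, order_date
"""

for row in client.query(query).result():
    print(row.region, row.order_date, row.revenue_7d)
```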

Experience Requirements

  • Minimum of 3 years of professional experience in data engineering or a related role

  • Proven expertise in designing and implementing large-scale data pipelines using GCP and Databricks

  • Hands-on experience with complex SQL query development and optimization

  • Working knowledge of workflow orchestration tools such as Airflow or Cloud Composer

  • Experience processing data from multiple sources, including APIs and cloud storage solutions

  • Experience working in an Agile environment is preferred

Alternative pathways:
Candidates with strong data pipeline experience on other cloud platforms who are willing to adapt and learn GCP services may be considered.

Day-to-Day Activities

  • Develop, test, and deploy data pipelines that support analytics, reporting, and data science initiatives

  • Collaborate with cross-functional teams during sprint planning, stand-ups, and code reviews

  • Monitor scheduled jobs for successful execution, troubleshoot failures, and optimize performance

  • Document processes, workflows, and data sources in compliance with organizational standards

  • Continuously review pipeline performance, implement improvements, and ensure robustness

  • Participate in scalable architecture design discussions and recommend best practices

Qualifications

  • Bachelor's degree in Computer Science, Data Science, Information Technology, or an equivalent field

  • At least 3 years of experience in data engineering, data architecture, or related roles

  • Demonstrated expertise with GCP, Databricks, SQL, and workflow orchestration tools

Certifications (preferred):

  • GCP certifications such as Professional Data Engineer or equivalent

  • Databricks Data Engineer certification

Professional Competencies

  • Critical thinking and effective problem-solving skills related to large-scale data processing

  • Strong collaboration abilities across multidisciplinary teams and stakeholders

  • Excellent communication skills with the ability to translate technical details into clear insights

  • Adaptability to evolving technologies and project requirements

  • Ability to prioritize tasks manage time efficiently and deliver on deadlines

  • Innovative mindset with a focus on continuous learning and process improvement

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity and Inclusion (DEI) initiative, Same Difference, is committed to fostering an inclusive culture, promoting equality and diversity, and maintaining an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, more successful businesses as a global company. We encourage applicants from diverse backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disability or veteran status, or any other characteristic protected by law.

Employment Type

Full-Time
