
Senior Data Engineer (Spark and GCP)

Job Location

San Jose, CA - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Job Summary:
We are looking for an experienced Senior Data Engineer with a strong background in Google Cloud Platform (GCP) and Apache Spark to join our dynamic team. You will be responsible for designing, building, and optimizing scalable data pipelines, leveraging GCP services and Spark to handle large-scale data processing and analytics. You will play a key role in shaping the architecture of our data platform and work closely with cross-functional teams to enable data-driven decision-making.

Key Responsibilities:
  • Design & Build Scalable Data Pipelines: Architect, build, and optimize highly efficient data pipelines using Apache Spark on Google Cloud Platform (GCP) services (e.g., BigQuery, Dataflow, Dataproc, Pub/Sub).
  • Data Processing & Transformation: Work with large volumes of structured and unstructured data, developing data processing and transformation workflows that support business intelligence and analytics use cases.
  • Collaborate with Cross-Functional Teams: Work closely with Data Scientists, Business Intelligence teams, and Product teams to understand business requirements and deliver scalable data solutions.
  • Big Data Engineering: Utilize Spark to process and analyze large datasets in distributed computing environments, ensuring data processing tasks are efficient and scalable.
  • Optimize Performance & Cost Efficiency: Fine-tune the performance of data workflows and reduce processing costs through the effective use of GCP services and Spark performance optimizations (e.g., partitioning, caching, memory management).
  • Cloud Infrastructure Management: Manage and optimize cloud resources in GCP, ensuring high availability, scalability, and reliability of data pipelines and processing jobs.
  • ETL & Data Integration: Design and implement complex ETL workflows, including data extraction, transformation, and loading from multiple source systems into cloud-based data warehouses or data lakes.
  • Data Quality & Governance: Ensure data quality and consistency across pipelines and adhere to data governance, security, and privacy standards.
  • Mentorship & Leadership: Provide technical leadership and mentorship to junior data engineers and foster a culture of best practices in data engineering.
  • Monitoring & Troubleshooting: Implement monitoring solutions to track pipeline performance, set up alerting for failures, and troubleshoot issues in the data processing workflows.
  • Documentation & Reporting: Create detailed technical documentation and reports to communicate data pipeline designs, performance metrics, and optimizations to stakeholders.
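The extract-transform-load pattern referenced in the responsibilities above can be sketched as follows. This is a minimal plain-Python illustration only; in a role like this the same logic would typically run as a Spark job on Dataproc, reading from Cloud Storage and writing to a date-partitioned BigQuery table. All record fields and function names here are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for rows read from Cloud Storage.
RAW_ROWS = [
    {"user_id": "u1", "event_date": "2024-05-01", "amount": "19.99"},
    {"user_id": "u2", "event_date": "2024-05-01", "amount": "bad"},  # malformed
    {"user_id": "u3", "event_date": "2024-05-02", "amount": "5.00"},
]

def extract(rows):
    """Extract: yield raw records (a real job would read Parquet/CSV from GCS)."""
    yield from rows

def transform(rows):
    """Transform: drop malformed rows and cast amount to a float."""
    for row in rows:
        try:
            yield {
                "user_id": row["user_id"],
                "event_date": row["event_date"],
                "amount": float(row["amount"]),
            }
        except (KeyError, ValueError):
            continue  # in Spark, bad rows would go to a dead-letter sink instead

def load(rows):
    """Load: bucket records by event_date, mimicking a date-partitioned table."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["event_date"]].append(row)
    return dict(partitions)

partitions = load(transform(extract(RAW_ROWS)))
```

Partitioning the loaded data by date mirrors the cost-optimization bullet above: a query engine such as BigQuery can then scan only the partitions a query actually touches.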

Skills & Qualifications:
  • Proven Experience: 5 years of hands-on experience in data engineering, with strong expertise in Google Cloud Platform (GCP) and Apache Spark.
  • GCP Services Expertise: Experience with GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud Composer, and Cloud Functions.
  • Big Data Technologies: Proficiency in working with Apache Spark (PySpark, Scala, or Java), Hadoop, and Kafka for building distributed data processing pipelines.
  • ETL Process Design: Expertise in designing and implementing complex ETL workflows, and understanding of data ingestion, transformation, and storage.
  • Programming Skills: Strong programming skills in Python, Scala, or Java, with hands-on experience in big data frameworks (e.g., Apache Spark).
  • SQL & NoSQL Databases: Expertise in SQL (BigQuery, PostgreSQL, etc.) and knowledge of NoSQL databases (e.g., MongoDB, Cassandra).
  • Data Warehousing: Experience building and managing data warehouses, especially using BigQuery or similar cloud-based storage systems.
  • Performance Optimization: Expertise in optimizing Spark jobs and cloud-based data workflows for performance, scalability, and cost efficiency.
  • Cloud Infrastructure Management: Familiarity with cloud-native DevOps practices, containerization (e.g., Docker), and CI/CD pipelines.
  • Data Governance & Security: Strong knowledge of data privacy, governance, and security best practices in cloud environments.
  • Version Control & Collaboration: Proficient in using version control tools (e.g., Git) and agile development practices.
  • Education: Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field. Certifications in GCP (e.g., Google Cloud Professional Data Engineer) are a plus.

Preferred Qualifications:
  • Real-Time Data Processing: Knowledge of real-time data processing tools such as Apache Kafka or Google Pub/Sub.

Personal Attributes:
  • Leadership: Strong leadership skills with a track record of leading data engineering teams and driving initiatives that improve data workflows.
  • Problem-Solving: Excellent analytical and problem-solving skills, particularly in distributed computing and large-scale data processing.
  • Collaboration: Effective communicator who can collaborate with technical and non-technical stakeholders.
  • Adaptability: Ability to thrive in a fast-paced, constantly evolving environment and embrace new technologies.
  • Mentorship: Passion for coaching and mentoring junior team members to develop their technical skills.

Why Join Us:
  • Innovative Work Environment: Join a team working with cutting-edge technologies to build scalable data solutions.
  • Career Growth: Opportunities to expand your expertise in GCP and Spark and to work on exciting, complex data engineering projects.
  • Competitive Compensation: Attractive salary, benefits, and opportunities for career advancement.



Required Experience:

Senior IC

Employment Type

Full-Time
