Data Engineer - Spark Migration

Job Location

Hyderabad - India

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Location: Hyderabad | Mode: Hybrid | Experience: 5 years | Role: Individual Contributor

We are hiring a Senior Data Engineer to support a critical data modernization initiative for a US-based global bank. The role focuses on migrating ETL workloads from legacy platforms (Ab Initio) to Apache Spark on Google Cloud. This is an IC-level position requiring strong hands-on skills in Spark development, schema mapping, and client delivery in a fast-paced agile setting.

Must-Have Skills

Apache Spark (via Python)

Must have developed Spark-based pipelines using Python, including transformations (joins, aggregations, filters), schema evolution, and partitioning. Should be capable of debugging Spark jobs and optimizing logic at the code level.
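For illustration, a minimal PySpark sketch of this kind of pipeline; the paths, tables, and column names are hypothetical placeholders, not part of the role description.

```python
# Minimal PySpark sketch: filter, join, aggregate, partitioned write.
# All paths and column names below are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("migration-sketch").getOrCreate()

orders = spark.read.parquet("gs://example-bucket/orders/")        # hypothetical input
customers = spark.read.parquet("gs://example-bucket/customers/")

daily_totals = (
    orders
    .filter(F.col("status") == "COMPLETE")                        # filter
    .join(customers, on="customer_id", how="left")                # join
    .groupBy("region", "order_date")                              # aggregation
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("order_id").alias("order_count"),
    )
)

# Partitioning the output by date keeps downstream reads selective.
(daily_totals.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("gs://example-bucket/daily_totals/"))
```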

Python (ETL scripting)

Must have written ETL scripts for file ingestion (CSV, JSON, Parquet), transformation routines, and validations. Scripts should follow modular design principles and include error handling and logging.
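As a sketch of what modular design with error handling and logging might look like here; the format dispatch table and required columns are assumptions:

```python
# Sketch of a modular ingestion routine with logging and fail-fast validation.
import logging
from pathlib import Path

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ingest")

READERS = {".csv": pd.read_csv, ".json": pd.read_json, ".parquet": pd.read_parquet}

def ingest(path: str) -> pd.DataFrame:
    """Load a source file, dispatching on its extension."""
    reader = READERS.get(Path(path).suffix.lower())
    if reader is None:
        raise ValueError(f"Unsupported format: {path}")
    try:
        df = reader(path)
        logger.info("Loaded %d rows from %s", len(df), path)
        return df
    except Exception:
        logger.exception("Ingestion failed for %s", path)
        raise

def validate(df: pd.DataFrame, required: set[str]) -> None:
    """Fail fast when expected columns are missing."""
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
```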

GCP – BigQuery, GCS

Must have used BigQuery for structured queries and GCS for input/output file staging. Should understand dataset partitioning, IAM roles, and cost-aware design for GCP data services.
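For example, with the official Python clients (project, table, and bucket names are placeholders; credentials are assumed to come from the environment):

```python
# Sketch: a structured BigQuery query plus GCS file staging.
from google.cloud import bigquery, storage

bq = bigquery.Client()

# Filtering on the partition column limits bytes scanned, which is the
# main lever for cost-aware BigQuery design.
sql = """
    SELECT region, SUM(amount) AS total_amount
    FROM `example-project.analytics.daily_totals`
    WHERE order_date = '2024-01-01'
    GROUP BY region
"""
for row in bq.query(sql).result():
    print(row.region, row.total_amount)

# Stage an output file to GCS for the next pipeline stage.
gcs = storage.Client()
gcs.bucket("example-staging-bucket") \
   .blob("outputs/daily_totals.parquet") \
   .upload_from_filename("daily_totals.parquet")
```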

Schema Mapping & Validation

Should have contributed to schema-level field mapping, transformation logic definition, and validation of output data parity post-migration.
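One common form of post-migration parity check, sketched below with hypothetical paths: compare row counts and an order-independent content fingerprint between the legacy and migrated outputs.

```python
# Sketch of an output-parity check between legacy and migrated datasets.
# Paths are illustrative; a real check would also compare schemas and
# per-column aggregates.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
legacy = spark.read.parquet("gs://example-bucket/legacy_output/")
migrated = spark.read.parquet("gs://example-bucket/spark_output/")

assert legacy.count() == migrated.count(), "Row-count mismatch"

def fingerprint(df):
    # Summing per-row hashes is insensitive to row order, so it survives
    # Spark's shuffles. Columns are sorted so both sides hash identically.
    cols = sorted(df.columns)
    return df.select(F.sum(F.hash(*cols)).alias("fp")).first()["fp"]

assert fingerprint(legacy) == fingerprint(migrated), "Content mismatch"
```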

Client-Facing Delivery

Must have participated in requirement workshops, solution walkthroughs, and defect resolution. Should be capable of independently handling delivery documentation and client coordination.

Nice-to-Have Skills

Ab Initio (read-only exposure)

Preferred if the candidate has reviewed Ab Initio graphs or mapping sheets to support recreating the legacy logic in Spark. Hands-on Ab Initio work is not required.

Airflow / Cloud Composer

Helpful if familiar with DAG creation and job orchestration using Airflow or GCP's Cloud Composer. Should understand task dependencies and scheduling patterns.
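A minimal DAG sketch (Airflow 2.4+ `schedule` syntax; the task names and callables are stub placeholders):

```python
# Sketch of a daily DAG with an explicit task dependency.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # placeholder: pull source files into staging

def transform():
    ...  # placeholder: run the Spark/ETL step

with DAG(
    dag_id="daily_migration_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task  # transform runs only after extract succeeds
```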

GCP Dataflow / Pub/Sub

Useful for teams dealing with real-time ingestion. Familiarity with Dataflow architecture and Pub/Sub concepts is preferred but not essential.
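For orientation, the basic Pub/Sub publish path in Python (project and topic names are placeholders):

```python
# Sketch: publishing a message to a hypothetical Pub/Sub topic.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-project", "ingest-events")

future = publisher.publish(topic_path, b'{"file": "orders_2024-01-01.csv"}')
print("Published message ID:", future.result())
```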

Logging and Monitoring

Should have exposure to pipeline monitoring via structured logging, log analyzers, or GCP-native logging frameworks.
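A sketch of structured (JSON) logging, which GCP's logging stack can parse into queryable fields; the field names are illustrative:

```python
# Sketch: JSON-formatted logs. When written to stdout on GCP runtimes,
# the "severity" field is picked up by Cloud Logging.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("pipeline")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("stage complete")  # -> {"severity": "INFO", ...}
```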

CI/CD for Data Pipelines

Awareness of deploying data jobs via Jenkins, GitHub Actions, or Cloud Build is a plus, especially for projects involving frequent iteration.

Employment Type

Full Time
