Job Description:
Sound concepts of large-scale Data Warehouse and Data Lake design; ETL/ELT; Ab Initio; Apache Spark; PySpark; SQL; Oracle; Hadoop.
Advanced dimensional modeling, data vault, and schema design for large-scale Data Warehouses and Data Lakes.
Deep expertise in ETL/ELT engineering using Ab Initio (graphs, plans, PDL, metadata-driven design) and migration of those patterns to Spark.
Hands-on PySpark/Spark proficiency for batch and streaming joins, windowing, partitioning, and performance tuning on large datasets (see the first sketch below).
Strong command of Hadoop ecosystem components: HDFS, Hive, YARN, Oozie/Airflow, Ranger, Atlas, and security/governance frameworks.
Oracle SQL mastery, including performance tuning, partitioning, materialized views, and implementing and interpreting Virtual Private Database (VPD) policies (see the VPD sketch below).
Data ingestion architecture using CDC, Kafka, file-based ingestion, and incremental load frameworks for high-volume HR and financial data (see the streaming-ingestion sketch below).
Data quality engineering: reconciliation frameworks, validation rules, audit controls, lineage, and automated regression testing (see the reconciliation sketch below).
Cloud and lakehouse engineering on Databricks: Delta Lake, Unity Catalog, cluster optimization, job orchestration, and CI/CD (see the Delta merge sketch below).
Metadata-driven pipeline design, reusable transformation frameworks, and parameterized job orchestration patterns (see the metadata-driven sketch below).
Performance engineering across platforms: skew mitigation, partition strategy, broadcast vs. shuffle decisions, and storage format optimization (Parquet/ORC/Delta) (see the join-strategy sketch below).
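The sketches below are illustrative only: one hedged, minimal example per skill area, with every table name, path, topic, and parameter invented for this posting rather than taken from any real environment. First, a PySpark windowing and partitioning sketch: keeping the latest record per key with a ranking window, then repartitioning on the join key ahead of a wide join.

```python
# Hypothetical sketch: deduplicate a change feed with a window function,
# then repartition before a wide join. Names (employee_events, emp_id,
# event_ts) and the partition count are illustrative.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("windowing-sketch").getOrCreate()

events = spark.read.parquet("/data/employee_events")  # assumed path

# Keep only the most recent event per employee.
w = Window.partitionBy("emp_id").orderBy(F.col("event_ts").desc())
latest = (
    events
    .withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Repartition on the join key so the subsequent wide join shuffles evenly.
latest = latest.repartition(200, "emp_id")
```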
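A minimal VPD sketch, assuming a python-oracledb session with EXECUTE on DBMS_RLS; the schema, table, context, and policy names are all hypothetical. The policy function returns a predicate string that Oracle appends to matching statements once the policy is registered.

```python
# Hypothetical sketch: register a row-level VPD policy from Python.
# Credentials, DSN, and all object names are placeholders.
import oracledb

conn = oracledb.connect(user="hr_admin", password="example", dsn="dbhost:1521/orclpdb1")
cur = conn.cursor()

# Policy function: restrict rows to the department held in a session context.
cur.execute("""
    CREATE OR REPLACE FUNCTION hr_admin.dept_predicate (
        p_schema IN VARCHAR2, p_object IN VARCHAR2
    ) RETURN VARCHAR2 IS
    BEGIN
        RETURN 'dept_id = SYS_CONTEXT(''hr_ctx'', ''dept_id'')';
    END;
""")

# Attach the predicate to the table for queries and DML.
cur.execute("""
    BEGIN
        DBMS_RLS.ADD_POLICY(
            object_schema   => 'HR',
            object_name     => 'EMPLOYEES',
            policy_name     => 'EMP_DEPT_POLICY',
            function_schema => 'HR_ADMIN',
            policy_function => 'DEPT_PREDICATE',
            statement_types => 'SELECT,UPDATE,DELETE');
    END;
""")
conn.close()
```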
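A streaming-ingestion sketch: consuming a CDC topic from Kafka with Structured Streaming and landing micro-batches in the lake. The broker address, topic, and paths are invented.

```python
# Hypothetical sketch: read a CDC topic and land it as Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

cdc = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker
    .option("subscribe", "hr.employees.cdc")            # assumed topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast the payload for downstream parsing.
parsed = cdc.select(F.col("value").cast("string").alias("payload"))

query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "/lake/bronze/employees_cdc")
    .option("checkpointLocation", "/chk/employees_cdc")
    .trigger(processingTime="1 minute")
    .start()
)
```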
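A reconciliation sketch comparing a row count and a control total between a source extract and the loaded target; the paths, the amount column, and the tolerance are assumptions.

```python
# Hypothetical sketch: source-vs-target reconciliation on count and sum.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
src = spark.read.parquet("/raw/payroll")           # assumed source extract
tgt = spark.read.parquet("/lake/silver/payroll")   # assumed loaded target

src_stats = src.agg(F.count(F.lit(1)).alias("rows"), F.sum("amount").alias("total")).first()
tgt_stats = tgt.agg(F.count(F.lit(1)).alias("rows"), F.sum("amount").alias("total")).first()

# Fail fast on mismatch; a real framework would log to an audit table instead.
assert src_stats["rows"] == tgt_stats["rows"], "row count mismatch"
assert abs(src_stats["total"] - tgt_stats["total"]) < 0.01, "control total mismatch"
```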
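A Delta merge sketch using the delta-spark DeltaTable API for an incremental upsert into a silver table; the paths and the emp_id key are hypothetical.

```python
# Hypothetical sketch: upsert a staging feed into a Delta table.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
updates = spark.read.parquet("/staging/hr_updates")  # assumed staging feed

target = DeltaTable.forPath(spark, "/lake/silver/hr_employees")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.emp_id = s.emp_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```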
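A metadata-driven sketch: pipeline steps expressed as data and applied by a generic runner, so new transformations are configuration changes rather than code changes. The step schema (op/expr/from/to/name) is invented for illustration, not a reference to any specific framework.

```python
# Hypothetical sketch: a config-driven transformation runner.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# In practice this metadata would come from a repository, not a literal.
PIPELINE = [
    {"op": "filter", "expr": "status = 'ACTIVE'"},
    {"op": "rename", "from": "emp_no", "to": "emp_id"},
    {"op": "derive", "name": "load_dt", "expr": "current_date()"},
]

def apply_step(df: DataFrame, step: dict) -> DataFrame:
    if step["op"] == "filter":
        return df.filter(step["expr"])
    if step["op"] == "rename":
        return df.withColumnRenamed(step["from"], step["to"])
    if step["op"] == "derive":
        return df.withColumn(step["name"], F.expr(step["expr"]))
    raise ValueError(f"unknown op: {step['op']}")

df = spark.read.parquet("/raw/employees")  # assumed path
for step in PIPELINE:
    df = apply_step(df, step)
```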
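A join-strategy sketch: broadcasting a small dimension to avoid shuffling the large fact table, and salting a skewed key across N buckets for a large-to-large join. All names and the bucket count are illustrative.

```python
# Hypothetical sketch: broadcast vs. shuffle decisions and key salting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
facts = spark.read.parquet("/data/facts")       # large, skewed on dept_id
dims = spark.read.parquet("/data/dim_dept")     # small enough to broadcast

# Small dimension: a broadcast join skips shuffling the fact table.
enriched = facts.join(F.broadcast(dims), "dept_id")

# Skewed large-to-large join: spread the hot key across N salt buckets.
N = 16
salted_facts = facts.withColumn("salt", (F.rand() * N).cast("int"))
salted_other = (
    spark.read.parquet("/data/other_large")
    .withColumn("salt", F.explode(F.array([F.lit(i) for i in range(N)])))
)
joined = salted_facts.join(salted_other, ["dept_id", "salt"])
```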