Greetings,
Job type: Full-time
Role: Ab Initio Developer
Location: Tampa, FL (3 days onsite per week)
Job Description:
Sound grasp of large data warehouse and data lake concepts: ETL/ELT, Ab Initio, Apache Spark, PySpark, SQL,
Oracle, and Hadoop.
Advanced dimensional modeling, data vault, and schema design for large-scale data warehouses and data lakes.
Deep expertise in ETL/ELT engineering using Ab Initio (graphs, plans, PDL, metadata-driven design) and migration
of those patterns to Spark.
Hands-on PySpark/Spark proficiency for batch, streaming, joins, windowing, partitioning, and performance tuning on
large datasets.
Strong command of Hadoop ecosystem components: HDFS, Hive, YARN, Oozie/Airflow, Ranger, Atlas, and
security/governance frameworks.
Oracle SQL mastery, including performance tuning, partitioning, materialized views, and implementing/decoding
Virtual Private Database (VPD) policies.
Data ingestion architecture using CDC, Kafka, file-based ingestion, and incremental load frameworks for high-volume
HR and financial data.
Data quality engineering: reconciliation frameworks, validation rules, audit controls, lineage, and automated
regression testing.
Cloud and lakehouse engineering on Databricks: Delta Lake, Unity Catalog, cluster optimization, job orchestration,
and CI/CD.
Metadata-driven pipeline design, reusable transformation frameworks, and parameterized job orchestration patterns.
Performance engineering across platforms: skew mitigation, partition strategy, broadcast vs. shuffle decisions, and
storage format optimization (Parquet/ORC/Delta).
With regards,
Sai Yashwanth
IT Recruiter