Position: Data Engineer (Databricks & AWS)
Company Overview
Citco is a global leader in financial services, delivering innovative solutions to some of the world's largest institutional clients. We harness the power of data to drive operational efficiency and informed decision-making. We are looking for a Data Engineer with strong Databricks expertise and AWS experience to contribute to mission-critical data initiatives.
Role Summary
As a Data Engineer, you will be responsible for developing and maintaining end-to-end data solutions on Databricks (Spark, Delta Lake, MLflow, etc.) while working with core AWS services (S3, Glue, Lambda, etc.). You will work within a technical team, implementing best practices in performance, security, and scalability. This role requires a solid understanding of Databricks and experience with cloud-based data platforms.
Key Responsibilities
Platform & Development
- Implement Databricks Lakehouse solutions using Delta Lake for ACID transactions and data versioning
- Utilize Databricks SQL Analytics for querying and report generation
- Support cluster management and Spark job optimization
- Develop structured streaming pipelines for data ingestion and processing (see the sketch after this list)
- Use Databricks Repos, notebooks, and job scheduling for development workflows
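To give a flavour of this work, here is a minimal, hypothetical sketch of a Delta Lake batch write (with time travel) and an Auto Loader streaming ingest; the paths, table names, and schemas are illustrative placeholders, not Citco's actual pipelines:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # in a Databricks notebook, `spark` already exists

# Batch append to a Delta table: ACID transactions and versioning are built in
orders = spark.read.json("/mnt/raw/orders/")                      # hypothetical landing path
orders.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Time travel to an earlier version of the table for audits or debugging
previous = spark.sql("SELECT * FROM bronze.orders VERSION AS OF 0")

# Structured streaming ingest of the same feed using Auto Loader
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/_schema")
    .load("/mnt/raw/orders/")
)
(
    stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")
    .toTable("bronze.orders_streaming")
)
```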
AWS Integration
- Work with Databricks and AWS S3 integration for data lake storage
- Build ETL/ELT pipelines using the AWS Glue Data Catalog, AWS Lambda, and AWS Step Functions (a small Lambda sketch follows this list)
- Configure networking settings for secure data access
- Support infrastructure deployment using AWS CloudFormation or Terraform
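As an illustration of the AWS side, here is a hedged sketch of a Lambda handler that starts a Glue job when a new object lands in S3; the job name, bucket, and arguments are placeholders rather than real resources:

```python
import boto3

glue = boto3.client("glue")

def handler(event, context):
    # S3 put events carry the bucket and object key that triggered this Lambda
    record = event["Records"][0]["s3"]
    s3_path = f"s3://{record['bucket']['name']}/{record['object']['key']}"

    # Start a (hypothetical) Glue job, passing the new file as a job argument
    response = glue.start_job_run(
        JobName="raw-to-bronze-etl",                 # placeholder job name
        Arguments={"--input_path": s3_path},
    )
    return {"JobRunId": response["JobRunId"]}
```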
Pipeline & Workflow Development
- Create scalable ETL frameworks using Spark (Python/Scala)
- Participate in workflow orchestration and CI/CD implementation
- Develop Delta Live Tables pipelines for data ingestion and transformations (sketched after this list)
- Support MLflow integration for data lineage and reproducibility
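A minimal Delta Live Tables sketch of a bronze/silver pattern is shown below; it only runs inside a Databricks DLT pipeline, and the dataset names, source path, and expectation are illustrative assumptions:

```python
import dlt
from pyspark.sql import functions as F

# `spark` is provided by the DLT pipeline runtime

@dlt.table(comment="Raw trade events ingested as-is")
def bronze_trades():
    return spark.read.json("/mnt/raw/trades/")        # hypothetical source path

@dlt.table(comment="Cleaned trades with basic typing and deduplication")
@dlt.expect_or_drop("valid_amount", "amount IS NOT NULL AND amount > 0")
def silver_trades():
    return (
        dlt.read("bronze_trades")
        .withColumn("trade_date", F.to_date("trade_ts"))
        .dropDuplicates(["trade_id"])
    )
```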
Performance & Optimization
- Implement Spark job optimizations (caching, partitioning, joins; see the sketch after this list)
- Support cluster configuration for optimal performance
- Optimize data processing for large-scale datasets
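The sketch below shows what these optimizations typically look like in PySpark (caching a reused DataFrame, broadcasting the small side of a join, and partitioning on write); table and column names are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

trades = spark.table("silver.trades")               # large fact table (placeholder)
accounts = spark.table("dim.accounts")              # small dimension table (placeholder)

# Cache a DataFrame that several downstream aggregations will reuse
trades_cached = trades.cache()

# Broadcast the small side of the join to avoid shuffling the large table
enriched = trades_cached.join(broadcast(accounts), on="account_id", how="left")

# Repartition on a high-cardinality key before a wide transformation, then
# write partitioned by date so downstream readers can prune files
(
    enriched.repartition("account_id")
    .write.format("delta")
    .partitionBy("trade_date")
    .mode("overwrite")
    .saveAsTable("gold.trades_enriched")
)
```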
Security & Governance
- Apply Unity Catalog features for governance and access control (a grant example appears after this list)
- Follow compliance requirements and security policies
- Implement IAM best practices
- Participate in code reviews and knowledge-sharing sessions
- Work within Agile/Scrum development framework
- Collaborate with team members and stakeholders
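For the Unity Catalog item above, here is a hedged sketch of table-level grants issued from a notebook; the catalog, schema, table, and group names are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is predefined

# Grant an engineering group access to a schema and an analyst group read access to a table
spark.sql("GRANT USE CATALOG ON CATALOG finance TO `data-engineers`")
spark.sql("GRANT USE SCHEMA ON SCHEMA finance.silver TO `data-engineers`")
spark.sql("GRANT SELECT ON TABLE finance.silver.trades TO `reporting-analysts`")

# Review effective grants on the table
spark.sql("SHOW GRANTS ON TABLE finance.silver.trades").show(truncate=False)
```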
Monitoring & Maintenance
- Help implement monitoring solutions for pipeline performance
- Support alert system setup and maintenance
- Ensure data quality and reliability standards
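A basic example of the kind of data-quality check that might back such monitoring is sketched below; the table, column, and threshold are illustrative assumptions, and the alerting mechanism would depend on the team's tooling:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.table("silver.trades")                    # placeholder table
total = df.count()
null_keys = df.filter(F.col("trade_id").isNull()).count()

null_ratio = (null_keys / total) if total else 0.0
if null_ratio > 0.01:                                # hypothetical 1% threshold
    # In practice this might emit a CloudWatch metric or notify an on-call channel
    raise ValueError(f"Data quality check failed: {null_ratio:.2%} of trade_ids are null")
```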
Qualifications
Background
- Bachelor's degree in Computer Science, Data Science, Engineering, or equivalent experience
Experience
- Databricks Experience: 2 years of hands-on Databricks (Spark) experience
- AWS Knowledge: Experience with AWS S3, Glue, Lambda, and basic security practices
- Programming Skills: Strong proficiency in Python (PySpark) and SQL
- Data Warehousing: Understanding of RDBMS and data modeling concepts
- Infrastructure: Familiarity with infrastructure as code concepts
#LI-AD2