Senior Data Engineer (Scala, Apache Spark, Databricks)
We are seeking a Senior Data Engineer who will be instrumental in designing, developing, and optimizing our next-generation data pipelines and analytics solutions, leveraging Scala, Apache Spark, and Databricks. You will work on complex data challenges, contributing to a scalable and robust data architecture that drives critical business insights.
Responsibilities:
- Design, develop, and maintain robust, scalable, and efficient ETL/ELT pipelines using Apache Spark, primarily with Scala (a minimal sketch follows this list).
- Develop and optimize data processing jobs within the Databricks platform, utilizing notebooks, Delta Lake, and other Databricks features.
- Collaborate with data scientists, analysts, and other engineering teams to understand data requirements and translate them into technical solutions.
- Implement data governance, quality, and security best practices within the data platform.
- Optimize existing Spark jobs and Databricks workflows for performance, cost-efficiency, and reliability.
- Troubleshoot and resolve complex data-related issues, ensuring data integrity and availability.
- Participate in code reviews, promote best practices, and mentor junior team members.
- Stay up to date with the latest advancements in big data technologies, particularly within the Spark and Databricks ecosystems.
- Contribute to the overall data architecture strategy and roadmap.
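For illustration, here is a minimal sketch of the kind of Spark/Scala batch pipeline this role owns: read raw events from cloud storage, clean and aggregate them, and write the result as a Delta table. All paths, column names, and the schema are hypothetical placeholders, not a description of our actual pipelines.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object OrdersEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-etl")
      .getOrCreate()

    // Extract: raw order events landed in cloud storage (hypothetical path).
    val raw = spark.read.json("s3://example-bucket/raw/orders/")

    // Transform: drop malformed rows, derive a date column, aggregate per day.
    val daily = raw
      .filter(col("order_id").isNotNull && col("amount") > 0)
      .withColumn("order_date", to_date(col("order_ts")))
      .groupBy("order_date")
      .agg(
        count("order_id").as("order_count"),
        sum("amount").as("total_amount")
      )

    // Load: write the aggregate as a Delta table (Delta Lake is built into
    // Databricks runtimes; elsewhere it requires the delta-spark library).
    daily.write
      .format("delta")
      .mode("overwrite")
      .save("s3://example-bucket/curated/daily_orders/")

    spark.stop()
  }
}
```

On Databricks, logic like this would typically run as a scheduled Job, with the output table registered in Unity Catalog rather than addressed by storage path.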
Experience:
- 7 years of professional experience as a Data Engineer or Software Engineer with a strong focus on data.
- Expert-level proficiency in Scala for data processing and application development.
- Extensive experience with Apache Spark (Spark SQL, Spark Streaming, Spark Core) for large-scale data manipulation and transformation.
- Deep hands-on experience with Databricks platform features, including notebooks, Delta Lake, Unity Catalog, Jobs, and cluster management.
- Solid understanding of distributed systems and big data architectural patterns.
- Proficiency in SQL for data querying and manipulation.
- Experience with cloud platforms (AWS, Azure, GCP) and their data-related services (e.g., S3, ADLS, GCS).
- Familiarity with data warehousing concepts and dimensional modeling.
- Experience with version control systems (e.g., Git).
- Strong problem-solving skills and the ability to work independently and as part of a team.
- Excellent communication and collaboration skills.
- Experience with real-time data processing (e.g., Kafka, Kinesis); a streaming sketch follows this list.
- Knowledge of other programming languages (e.g., Python).
- Experience with CI/CD pipelines for data solutions.
- Familiarity with data governance tools and principles.
- Contributions to open-source projects.
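As a concrete illustration of the real-time item above, the sketch below consumes JSON events from Kafka with Spark Structured Streaming and appends them to a Delta table, using a checkpoint location for restartability. It assumes the spark-sql-kafka connector is on the classpath; the broker address, topic, schema, and paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object ClickstreamIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-ingest")
      .getOrCreate()

    // Assumed shape of each Kafka message value for this sketch.
    val schema = new StructType()
      .add("user_id", StringType)
      .add("event_type", StringType)
      .add("event_ts", TimestampType)

    // Source: a Kafka topic of JSON-encoded click events.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092")
      .option("subscribe", "clickstream")
      .load()
      .select(from_json(col("value").cast("string"), schema).as("e"))
      .select("e.*")

    // Sink: append to a Delta table; the checkpoint directory is what
    // lets the query recover with exactly-once guarantees after restart.
    events.writeStream
      .format("delta")
      .option("checkpointLocation", "s3://example-bucket/checkpoints/clickstream/")
      .start("s3://example-bucket/bronze/clickstream/")
      .awaitTermination()
  }
}
```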
Required Skills: Data Warehouse
Background Check: Yes
Drug Screen: No