About the Role
We are looking for a highly skilled Senior Software Engineer with deep expertise in Apache Spark and distributed data processing. In this role, you will work directly on Apache Spark internals, contribute upstream improvements to the Spark open-source community, and adapt Spark capabilities to support DataPelago's product requirements.
This position demands strong technical ownership, independent execution, and the ability to drive high-impact engineering initiatives in a fast-paced environment.
Key Responsibilities
- Contribute code fixes and enhancements directly to the Apache Spark open-source project.
- Upgrade and maintain compatibility with newer Apache Spark community releases.
- Analyze, modify, and optimize Spark internals to support DataPelago's platform requirements.
- Design and implement scalable distributed data processing solutions.
- Debug and resolve complex performance, stability, and scalability issues within Spark-based systems.
- Collaborate with product and platform teams to align Spark capabilities with business and technical objectives.
- Drive architecture discussions and provide technical leadership across distributed systems initiatives.
- Ensure high engineering standards through code reviews, testing, documentation, and best practices.
- Work independently with minimal supervision while delivering high-quality outcomes.
Required Qualifications
- Strong experience with Apache Spark internals and distributed computing systems.
- Proven experience contributing to open-source projects, preferably Apache Spark or related Apache ecosystem technologies.
- Expertise in Java and/or Scala programming.
- Strong understanding of query execution, distributed processing, memory management, and performance optimization.
- Experience upgrading and maintaining large-scale Spark deployments.
- Deep knowledge of big data technologies and distributed systems architecture.
- Strong debugging, problem-solving, and performance tuning skills.
- Ability to work autonomously and lead technically challenging initiatives.
Preferred Qualifications
- Experience with query engines, vectorized execution, or data processing frameworks.
- Familiarity with Kubernetes, cloud-native environments, and large-scale infrastructure.
- Knowledge of JVM performance tuning and low-level system optimization.
- Prior experience working closely with open-source communities.
What We Expect
- Senior-level ownership and accountability.
- High independence with the ability to make impactful technical decisions.
- Strong communication and collaboration skills.
- Passion for open-source software and distributed data technologies.
Nice to Have
- Active GitHub or Apache contributor profile.
- Experience with large-scale analytics or database systems.
- Publications, talks, or community involvement in big data technologies.
Required Experience:
IC