Job Details:
JD for Cloud Engineer with Spark.
- Seeking a skilled engineer to join our team with expertise in Red Hat OpenShift, Google Cloud Platform (GCP), and Apache Spark.
- This role focuses on designing, deploying, and managing scalable data processing solutions in a cloud-native environment.
- You will work closely with data scientists, software engineers, and DevOps teams to ensure robust, high-performance data pipelines and analytics platforms.
Responsibilities:
- Platform Management: Deploy, configure, and maintain OpenShift clusters or GCP projects to support containerized Spark applications.
- Data Pipeline Development: Design and implement large-scale data processing workflows using Apache Spark (see the pipeline sketch after this list).
- Optimization: Tune Spark jobs for performance, leveraging OpenShift's resource management capabilities (e.g., Kubernetes orchestration, auto-scaling).
- Integration: Integrate Spark with other data sources (e.g., Kafka, S3, cloud storage) and sinks (e.g., databases, data lakes).
- CI/CD Implementation: Build and maintain CI/CD pipelines for deploying Spark applications on OpenShift or GCP using tools like GitHub Actions, Sonar, and Harness.
- Monitoring & Troubleshooting: Monitor cluster health, Spark job performance, and resource utilization using OpenShift tools (e.g., Prometheus, Grafana), and resolve issues proactively.
- Security: Ensure compliance with security standards, implementing role-based access control (RBAC) and encryption for data in transit and at rest (an RBAC sketch also follows this list).
- Collaboration: Work with cross-functional teams to define requirements, architect solutions, and support production deployments.
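
For illustration, the pipeline sketch referenced above: a minimal PySpark structured-streaming job that reads from Kafka and writes Parquet to object storage. This is a sketch under assumptions, not a prescribed implementation; the broker address, topic, and bucket paths are placeholders, and the Kafka and S3/GCS connectors are assumed to be on the Spark classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

    # Read a stream from Kafka; broker and topic names are placeholders.
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "kafka:9092")
        .option("subscribe", "events")
        .load()
    )

    # Kafka delivers the payload as binary; decode it and derive a partition column.
    parsed = (
        events.select(F.col("value").cast("string").alias("raw"))
        .withColumn("ingest_date", F.current_date())
    )

    # Sink to object storage as Parquet; s3a:// paths assume the Hadoop S3 connector
    # (use gs:// with the GCS connector on GCP). Bucket names are placeholders.
    query = (
        parsed.writeStream
        .format("parquet")
        .option("path", "s3a://example-bucket/events/")
        .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
        .partitionBy("ingest_date")
        .start()
    )
    query.awaitTermination()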
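And the RBAC sketch referenced above: one way to provision a namespaced role is the official Kubernetes Python client, which also works against OpenShift. The namespace, role name, and rule set below are hypothetical examples, not a mandated policy.

    from kubernetes import client, config

    # Load credentials from the local kubeconfig (in-cluster config also works).
    config.load_kube_config()

    # A namespaced Role allowing a Spark driver to manage its executor pods;
    # "data-eng" and "spark-submitter" are placeholder names.
    role = client.V1Role(
        metadata=client.V1ObjectMeta(name="spark-submitter", namespace="data-eng"),
        rules=[
            client.V1PolicyRule(
                api_groups=[""],
                resources=["pods", "services", "configmaps"],
                verbs=["get", "list", "watch", "create", "delete"],
            )
        ],
    )
    client.RbacAuthorizationV1Api().create_namespaced_role(
        namespace="data-eng", body=role
    )
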
Qualifications:
- 5 years of experience with Apache Spark for big data processing.
- 3 years of Django development experience.
- 2 years of experience creating and maintaining Conda environments.
- 4 years of experience managing containerized environments with OpenShift or Kubernetes.
Technical Skills:
- Proficiency in Spark frameworks (Python/PySpark, Scala, or Java); a tuning sketch follows this list.
- Hands-on experience with OpenShift administration (e.g., cluster setup, networking, storage).
- Proficiency in creating and maintaining Conda environments and dependencies.
- Familiarity with Docker and Kubernetes concepts (e.g., pods, deployments, services, and images).
- Knowledge of distributed systems, cloud platforms (AWS, GCP, Azure), and data storage solutions (e.g., S3, HDFS).
- Programming: Strong coding skills in Python, Scala, or Java; experience with shell scripting is a plus.
- Tools: Experience with Git, GitHub Actions, Helm, Harness, and other CI/CD tools.
- Problem-Solving: Ability to debug complex issues across distributed systems and optimize resource usage.
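
The tuning sketch referenced above: executor resource and shuffle settings can be set on the SparkSession builder. The values are placeholders to be sized against real cluster capacity, not recommendations.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("tuned-batch-job")
        # Executor sizing: on Kubernetes/OpenShift these become pod resource requests.
        .config("spark.executor.instances", "4")
        .config("spark.executor.cores", "4")
        .config("spark.executor.memory", "8g")
        # Match shuffle parallelism to total executor cores to avoid tiny tasks.
        .config("spark.sql.shuffle.partitions", "64")
        # Adaptive query execution coalesces partitions and mitigates skew at runtime.
        .config("spark.sql.adaptive.enabled", "true")
        .getOrCreate()
    )

Auto-scaling, as mentioned in the responsibilities, is usually layered on via spark.dynamicAllocation.enabled together with cluster-side autoscaling.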
Education:
- Bachelor's degree in Computer Science, Engineering, or a related field.