Reliability Engineer, Data Platforms

Apple

Job Location:

Hyderabad - India

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

As part of our team, you will be responsible for developing and operating our big data platform, using open source or other solutions to support critical applications such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production errors and issues to ensure the best data platform experience.


  • 3 years of professional software engineering experience with large-scale big data platforms, including strong programming skills in Java, Scala, Python, or Go.
  • Proven expertise in designing, building, and operating large-scale distributed data processing systems, with a strong focus on Apache Spark.
  • Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance.
  • Skilled at coding for distributed systems and developing resilient data pipelines.
  • Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments.
  • Proficient with Unix/Linux systems and command-line tools for debugging and operational support.


  • Expertise in designing, building, and operating critical large-scale distributed systems, with a focus on low latency, fault tolerance, and high availability.
  • Experience contributing to open source projects is a plus.
  • Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues.
  • Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
  • Understanding of data modeling and data warehousing concepts.
  • Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
  • A learning attitude to continuously improve oneself, the team, and the organization.
  • Solid understanding of software engineering best practices, including the full development lifecycle and secure coding, and experience building reusable frameworks or libraries.

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Ask Siri to name the most successful company in the world and it might respond: Apple. And it's not just out of familial pride. Apple consistently ranks highly in profit, revenue, market capitalization, and consumer cachet. In 2018, the company became the first to reach a trillion dollar ...
