Senior Java Platform Reliability Engineer


Job Location:

Malvern, PA - USA

Monthly Salary: Not Disclosed
Posted on: 2 days ago
Vacancies: 1

Job Summary

#W2 Role
Job Title: Senior Java / Platform Reliability Engineer
Location: Malvern, PA
Duration: Long-term Contract
Job Description
We are seeking a Senior Java / Platform Reliability Engineer with 10 years of solid software engineering experience and a strong understanding of production reliability. This role is not a traditional operations-focused SRE position. The ideal candidate is a hands-on backend engineer who can design, build, and support resilient, scalable, and fault-tolerant applications in a cloud environment.
The role will focus on backend platform engineering, reliability improvements, cloud integration, observability, automation, and production support. The candidate should have strong experience in Java development along with working knowledge of Python, AWS, APIs, microservices, and cloud-native application support.
Key Responsibilities
Design, develop, and enhance backend services using Java and related backend technologies.
Build reliable, scalable, and fault-tolerant applications that can operate in production at scale.
Work closely with platform, application, and infrastructure teams to improve system reliability and performance.
Support production systems by identifying reliability gaps, performance issues, and areas for automation.
Develop and maintain microservices, APIs, and backend integrations.
Work with AWS cloud services to support application deployment, monitoring, and platform improvements.
Use Python or scripting where needed for automation, tooling, and reliability engineering tasks.
Contribute to incident analysis, root cause reviews, and long-term preventive solutions.
Improve observability through logging, metrics, tracing, dashboards, and alerting.
Collaborate with engineering teams to implement best practices around resiliency, scalability, and availability.
Participate in design discussions for backend systems, cloud architecture, and platform reliability.
Help modernize and improve existing applications with better monitoring, automation, and fault tolerance.
Required Skills
10 years of overall software engineering experience.
Strong hands-on experience with Java development.
Experience building and supporting backend services, APIs, and microservices.
Good experience with AWS/cloud technologies.
Working knowledge of Python for scripting, automation, or backend support.
Strong understanding of production reliability, scalability, and system performance.
Experience operating applications in production environments.
Knowledge of resilient application design, fault tolerance, retries, timeouts, failover, and recovery patterns.
Experience with CI/CD pipelines and modern software delivery practices.
Strong troubleshooting and problem-solving skills.
Ability to work with cross-functional teams, including development, platform, cloud, and infrastructure teams.
Preferred Skills
Experience with telemetry and observability tools such as OpenTelemetry, Splunk, Datadog, CloudWatch, Grafana, Prometheus, or similar tools.
Experience with distributed tracing, logging, metrics, and alerting.
Knowledge of container platforms such as Docker and Kubernetes.
Experience with infrastructure automation or platform engineering tools.
Exposure to event-driven architecture or messaging systems.
Prior experience in a Site Reliability Engineering, Platform Engineering, or Production Engineering role.