SRE Engineer with GCP & Java
Woonsocket, RI - USA
Job Summary
Core Skills & Technologies
SRE & Production Support
Production application support with a strong focus on:
Availability
Reliability
Performance
Observability & Monitoring
Grafana
Prometheus
ELK Stack
Splunk
AppDynamics
Application & Platform Support
Java and Spring Boot
Support and troubleshooting of microservices
Google Cloud Platform (GCP) including:
BigQuery
Cloud Spanner
Monitoring and Logging services
Database & Backend
MS SQL query writing and performance tuning
PL/SQL stored procedures and batch job support
Operations & ITSM
Strong experience with ITIL processes including:
Incident Management
Problem Management
Change Management
Perform Root Cause Analysis (RCA) and lead post-incident reviews
Provide production release support and change validation
Key Responsibilities
Monitor applications to ensure system reliability and stability.
Handle production incidents, escalations, and service restoration.
Improve alert quality and reduce alert noise.
Support application services and batch jobs in a 24x7 shift model.
Support production release activities and change validations.
Automation & Innovation
Use automation and Generative AI to reduce manual operational effort.
Build scripts and dashboards for:
Monitoring
Alert analysis
Performance and operational reporting
Additional Skills & Expectations
Strong communication skills for collaboration with onshore and offshore teams.
Ability to work effectively in fast-paced, high-pressure environments.
Willingness to work weekends and on-call shifts as required.
Skills Classification / Keywords
Digital: Google Cloud
Digital: Spring Boot
Digital: Site Reliability Engineering (SRE)
Core Java
Experience Required: 4-6 years