SRE Engineer

Austin, TX - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Job Description - SRE Engineer

Location: Austin, Texas (Hybrid, 3 days)

Need Face to Face.

Job Summary

Seasoned Site Reliability Engineer (SRE) with 5 years of experience supporting complex, large-scale distributed systems. Highly skilled in managing production failures, conducting root cause analysis, and driving effective remediation. Strong communicator with expertise in alerting, monitoring, and release management, complemented by automation proficiency and a keen ability to learn quickly.

This role involves providing 24/7 support as part of the SRE team, ensuring the reliability and performance of mission-critical and batch applications deployed across GCP, PCF, and on-premises environments.

Years of experience needed

Candidate experience 5 Years

Technical Skills:

Expertise in understanding large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, microservices, and configuration management.

Solid hands-on experience troubleshooting and fixing application failures, application performance degradation, code issues, cloud platform issues, batch failures, infrastructure failures, database failures, and network failures.

Hands-on experience performing production deployments using CI/CD, with exposure to deployment strategies.

Experience troubleshooting Linux/Unix systems.

Monitor application, service, and batch availability.

Act quickly on application alerts (performance, availability) and batch job failures.

Perform the required code and log analysis and escalate to the Engineering team as needed.

Initiate and drive the Techlines in case of outages, major incidents, or batch abends, and ensure service restoration as quickly as possible.

Effectively handle Incident, Problem, Release, and Change management.

Own and deliver the user stories assigned as part of the sprint.

o User stories include application code debugging, issue analysis, code fixes, knowledge base creation, SOP documentation, production deployments, pre- and post-patching/maintenance activities, and service requests.

o Build monitoring solutions using APM tools such as Splunk, AppDynamics, ThousandEyes, ITRS, AppMetrics, Moogsoft, Kafka, etc.

o Automate day-to-day operational tasks.

o Take part in exit reviews to ensure best practices are followed and the right code is deployed to production systems.

o Provide feedback and recommend improvements that make the system more stable.

Strong understanding of networking concepts (TCP/IP, SSL/TLS, IPSec, VPN, etc.), firewalls, and load balancers.

Experience in scripting with Shell, PowerShell, or Python.

Strong experience working with any cloud-based infrastructure (PCF, GCP, AWS, Azure, or other).

About Company

DRC Systems
