Hybrid Onsite - Irving, TX (3 days/week) - Site Reliability Engineer
Contract through end of 2025 (will extend)
Responsibilities
- Design and execute performance tests to evaluate responsiveness, scalability, and stability of applications.
- Conduct resiliency testing to validate fault tolerance and recovery strategies.
- Implement and monitor observability tools to track system health and detect issues in real time.
- Perform capacity planning and recommend scaling strategies for peak loads.
- Collaborate with developers and operations teams to optimize Java/Spring Boot microservices, database queries, and infrastructure configurations.
- Configure Kubernetes performance parameters (resource limits, requests, autoscaling policies).
- Implement resiliency patterns such as circuit breakers, bulkheads, retries, rate limiters, and fallback mechanisms.
- Document methodologies and provide training on performance and resiliency best practices.
- Continuously evaluate and improve testing and monitoring processes.
Required Technical Skills
- Programming: Strong experience with Java and Spring Boot for microservices.
- Containerization: Hands-on with Docker; experience deploying and tuning containerized applications.
- Scripting: Proficiency in Python and Bash for automation and test scripting.
- Cloud: Solid experience with Azure (mandatory); familiarity with cloud-native architectures.
- Observability/APM Tools: Splunk, ELK stack, AppDynamics (setup, monitoring, troubleshooting).
- Architecture & Resiliency: Knowledge of design patterns, fault tolerance strategies, and distributed systems.
- Microservices Support: Strong background in supporting and optimizing microservices applications.
- Computer Science Fundamentals: Algorithms, data structures, and architectural design best practices.
Preferred Skills
- Experience with Kubernetes (cluster configuration, autoscaling, resource tuning).
- Understanding of networking concepts (DNS, load balancing, firewalls, VPNs).
- Exposure to CI/CD pipelines and DevOps practices.