What are the top 5-10 responsibilities for this position? (Please be detailed as to what the candidate is expected to do or complete on a daily basis.)
- Performance Testing
- Design and execute performance tests to evaluate the system's responsiveness, stability, scalability, and resource usage.
- Identify performance bottlenecks and provide recommendations for improvements.
- Analyze test results and generate detailed performance reports.
- Resiliency Testing
- Conduct resiliency tests to ensure the system can handle failures and recover gracefully.
- Implement and test failure scenarios to validate the system's fault tolerance.
- Recommend and validate resiliency patterns such as circuit breakers, bulkheads, and retries.
- Performance Monitoring
- Set up and maintain performance monitoring tools to continuously track system performance.
- Analyze performance metrics and logs to detect and diagnose performance issues in real time.
- Capacity Planning
- Perform capacity planning to ensure the system can handle expected and peak loads.
- Provide recommendations for scaling resources based on performance data and future growth projections.
- Performance Optimization
- Collaborate with development and operations teams to optimize code, database queries, and infrastructure configurations.
- Recommend best practices for performance tuning and optimization.
- Kubernetes Performance Parameters
- Recommend and configure performance parameters for Kubernetes clusters, such as resource limits, requests, and autoscaling policies.
- Ensure optimal performance of containerized applications running in Kubernetes environments.
- Resiliency Patterns
- Recommend and implement resiliency patterns like circuit breakers, rate limiters, and fallback mechanisms to enhance system reliability.
- Validate the effectiveness of these patterns through testing and monitoring.
- Documentation and Training
- Document performance testing methodologies, tools, and best practices.
- Provide training and support to development and operations teams on performance and resiliency best practices.
- Continuous Improvement
- Continuously evaluate and improve performance testing and monitoring processes.
- Stay updated with the latest performance engineering tools, techniques, and industry trends.
What skills/technologies are required? (Please include the number of years of experience required.)
- Experience with containerization technologies like Docker.
- Strong scripting skills in languages such as Bash or Python.
- Effective problem-solving and analytical skills.
- Must be familiar with observability and APM tools like Splunk, ELK, AppDynamics, etc.
- Good understanding of architecture patterns and resiliency.
- Programming experience in Java and Spring Boot.
- Strong microservices application support experience.
- Proficient understanding of algorithms, data structures, architectural design patterns, and best practices.
- Experience with cloud platforms is required.
What skills/attributes are preferred? (These are desired, not required.)
- Experience working with applications on the Kubernetes platform is preferred.
- Understanding of networking concepts, including DNS, load balancing, firewalls, and VPNs.
What does the interview process look like?
- How many rounds? 3
- Video, phone, or in person? Video and onsite (if local)
How technical will the interviews be? They will be oral and include a code walkthrough.