We are seeking a motivated Site Reliability Engineer (SRE) to join our Observability team. In this role, you will support the team in maintaining and improving the reliability, security, and performance of our systems. You will learn from experienced engineers while gaining hands-on experience with modern monitoring, logging, and automation tools.
As an SRE I, you will assist in day-to-day operational tasks, help monitor system health, and participate in basic troubleshooting. You will also contribute to the maintenance of documentation and develop your technical skills through training and on-the-job experience.
Responsibilities
Assist in maintaining system security by applying hotfixes and operating system patches under guidance to protect against cybersecurity threats.
Support the deployment and configuration of monitoring and logging tools.
Help automate routine operational tasks to improve efficiency and support system integration.
Assist with the maintenance and basic management of observability tools such as Splunk, ClickHouse, Grafana, Prometheus, OpenTelemetry, Fluent Bit, Elasticsearch, OpenSearch, and CloudWatch.
Work with team members to help implement and maintain monitoring solutions in development, staging, and production environments.
Learn and apply DevOps and SRE best practices as directed by senior engineers.
Contribute to the setup and maintenance of CI/CD pipelines to support automated build, test, and deployment processes.
Provide support in managing cloud infrastructure (AWS, GCP) to help ensure availability and security.
Learn to use infrastructure-as-code tools such as Terraform, Ansible, or CloudFormation to support environment configuration.
Monitor system performance and assist in identifying and escalating issues for resolution.
Support the implementation and management of containerization technologies like Docker and Kubernetes.
Participate in basic troubleshooting and assist with root cause analysis for production incidents.
Help create and update documentation for infrastructure processes and operational procedures.
Provide first-level support for routine infrastructure and deployment issues, escalating complex problems as needed.
Look for opportunities to automate repetitive tasks and suggest improvements to workflows.
Visa's Observability ecosystem includes over 2,000 platform nodes utilizing approximately 15 different tools for logging, monitoring, and tracing, alongside 80,000 client agents. The system handles daily log ingestion exceeding 100 TB and oversees hundreds of critical applications supporting vital alerts, dashboards, and reports. To maintain this high level of performance and reliability, we need a Site Reliability Engineer (SRE) with comprehensive knowledge and practical experience. This position requires an I4-level engineer who can operate independently with minimal supervision.
About Visa's PRE Observability Team
Visa's Product Reliability Engineering (PRE) Observability team partners with Product Development as well as Operations & Infrastructure teams to build and manage innovative, reliable, scalable, secure, and cost-effective observability platform solutions. We are looking for talented Senior Site Reliability Engineers to join our driven team with a focus on maximizing system availability, performance, security, and reliability. This dynamic role requires technical leadership, strong problem-solving skills, and expertise in coding, testing, and debugging.
This is a hybrid position. The expected number of days in the office will be confirmed by your hiring manager.
Qualifications :
Basic Qualifications:
Preferred Qualifications:
Additional Information :
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
Remote Work :
No
Employment Type :
Full-time