Overview:
Join Visa's Technology Organization, a dynamic community of problem solvers and innovators dedicated to redefining the future of commerce. We manage one of the world's most advanced processing networks, handling over 65,000 secure transactions per second across 80 million merchants, 15,000 financial institutions, and billions of individuals. As a Lead Site Reliability Engineer (SRE), you will lead efforts to ensure the stability, security, and efficiency of our applications and systems, driving continuous improvement and innovation.
Key Responsibilities:
- Security and Safety: Ensure the security and safety of application services and platforms. Lead efforts to enhance operational practices, focusing on efficiency, security, and excellence.
- Zero Downtime: Maintain zero downtime by swiftly addressing any issues to ensure the environment is always operational. Conduct rapid root cause analysis and implement remediation in production environments after thorough testing.
- Environment Management: Oversee all activities within the environment, including deploying new code.
- Team Leadership: Inspire and lead the team to deliver strategic and innovative approaches that drive Visa's growth. Provide mentorship and foster a culture of collaboration and continuous improvement.
- Stakeholder Partnerships: Build strong partnerships with key stakeholders, including product management, engineering, design, and operations.
- Strategic Impact: Impact strategic decisions at all levels by interacting with other leaders on complex issues and applying strong judgment and analysis.
- Effective Communication: Communicate effectively with both technical and business partners to create frameworks for discussing complex topics.
- Automation and AI: Regularly analyze the environment and promote the adoption of automation and Generative AI to stay competitive.
- Cloud Infrastructure: Lead cloud infrastructure adoption and migration, ensuring a seamless transition with minimal downtime.
- Problem Resolution: Run problem bridges by collaborating with different functional and technical teams, escalating issues as needed for timely resolution.
- Information Sharing: Proactively share important context and information with relevant stakeholders.
- Operational Excellence: Spearhead the enhancement of operational practices, focusing on efficiency, security, and excellence.
This is a hybrid position. Expectations of days in office will be confirmed by your Hiring Manager.
Qualifications:
Basic Qualifications:
- 14 or more years of work experience with a Bachelor's Degree, or at least 12 years of work experience with an Advanced Degree (e.g. Master's/MBA/JD/MD), or at least 10 years of work experience with a PhD.
Education and Experience:
- 14 years of work experience in Site Reliability Engineering.
- 10 years of experience with Java/J2EE applications and a deep understanding of Web Services technologies: REST and SOAP.
- 5 years of experience managing applications on Containers (Docker) and Cloud (AWS, GCP, Azure).
Technical Skills:
- Strong understanding of relational databases and middleware stacks (IIS, .NET, Java, TcServer, JBoss, Containers).
- Knowledge of Generative AI capabilities and use cases.
- Advanced-level programming and/or scripting in 3 or more of the following: Python, Java, Go, PowerShell, JavaScript, Terraform, Ansible, Helm, Chef, CloudFormation.
- Proficiency in CI/CD tooling such as Jenkins, GitHub, Bitbucket, ArgoCD, Artifactory, and Azure DevOps in a large-scale environment.
- Experience in OO design and design patterns.
- Proficiency in observability tooling such as Grafana, Prometheus, Splunk, Datadog, New Relic, Dynatrace, Sentry, etc. in a large-scale environment.
- Experience with Docker and Kubernetes.
- Experience with integrating third-party Web Services.
Leadership and Communication:
- 5 years of leading and building Site Reliability teams.
- Strong work ethic and self-starter mindset, with the ability to work in a fast-paced, team-oriented environment and comfort working with a global team.
- Exceptional analytical and problem-solving skills, along with strong oral and written communication abilities.
- Proven proficiency in troubleshooting, root-cause analysis, application design, and implementing major components for large projects.
- Experience in creating tools to automate production support activities.
- Knowledge of monitoring tools and observability practices.
Additional Information:
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
Remote Work:
No
Employment Type:
Full-time