The Opportunity
We are seeking a skilled and innovative Sr. Site Reliability Engineer to join our team and help solve complex challenges on a global scale. The Middleware Product Reliability Engineering (PRE) group is dedicated to ensuring our products and services operate with Always On availability, exceptional reliability, and outstanding performance.
This role is ideal for software engineers with AI & ML backgrounds who want to apply their skills to large-scale reliability engineering. You'll develop intelligent automation systems and integrate machine learning models into our middleware infrastructure, working at the intersection of software development, artificial intelligence, and site reliability.
As a Visa Sr. Site Reliability Engineer, you will be an integral part of a cross-functional team inventing, designing, building, testing, and operating software products that reach a truly global customer base. While building and supporting components of cutting-edge payment technology, you will see your efforts shaping the digital future of monetary transactions.
What You'll Do
Support Middleware software and infrastructure components for all lines of business at Visa
Design and develop software solutions for middleware reliability using AI & ML techniques
Develop and improve Middleware monitoring and observability systems
Build intelligent automation systems leveraging machine learning models
Develop and maintain automation tools and integrations into existing AI & LLM frameworks to handle application support tasks
Collaborate with data science teams to integrate AI-driven insights into reliability engineering
Engage in production issue troubleshooting, provide immediate service restoration, follow up on root cause analysis, and ensure permanent fixes are implemented
Coordinate and execute Middleware releases and production deployments
Optimize performance and tuning for Middleware applications
The Skills You Bring
Collaboration and Communication: You possess strong interpersonal skills and excel at both written and verbal communication. You thrive in team environments and collaborate effectively with globally dispersed virtual teams.
Learning and Growth: You are adaptable and eager to learn new technologies and tools. You enjoy sharing knowledge with others and contributing to collective team growth.
Innovation and Problem-Solving: You are comfortable exploring beyond traditional solutions and embrace new technologies and innovative approaches. You excel at analytical thinking and creative problem-solving.
Decision-Making and Prioritization: You effectively prioritize, multitask, and deliver quality work on time. You can make informed decisions on execution timelines and maintain focus in high-pressure situations.
Professional Development: You take initiative in your work and demonstrate a strong sense of ownership. You are motivated to learn new technologies and business concepts to facilitate both personal and organizational growth.
Professional Ethics: You demonstrate strong business ethics, self-discipline, and trustworthiness, particularly when handling sensitive and confidential data in live production environments.
Technical Troubleshooting: You possess strong analytical and problem-solving skills, with the ability to swiftly identify and resolve complex technical issues. You excel at debugging, performance tuning, and root cause analysis. You are proactive in anticipating potential problems and implementing preventative measures to minimize disruptions.
This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.
Qualifications:
Basic Qualifications:
- 3 or more years of work experience with a Bachelor's degree, or more than 2 years of work experience with an advanced degree (e.g., Master's, MBA, JD, MD)
Core Skills:
- 3 years of experience with modern middleware technologies. These might include Tomcat, Apache, Spring Boot, SQS, JBoss, IBM MQ, IBM DataPower, Hazelcast, Flink, Connect Direct, and SSL.
- Understanding of Linux/Unix systems, networking, cloud platforms (AWS, Azure, GCP), containerization (Kubernetes, Docker), and infrastructure-as-code tools (Terraform, Ansible).
- Proficiency with monitoring tools (Prometheus, Grafana, Datadog, etc.), logging systems (ELK stack, Splunk), and tracing tools (Jaeger, Zipkin).
- Proven track record of automating complex tasks and processes to improve efficiency and reliability using Python, Go, Java, or similar.
Technical Areas You'll Grow In:
- Cloud & System Architecture: Design scalable, resilient systems across hybrid cloud platforms (AWS, GCP, Azure)
- AI/ML Operations: Support and optimize ML model deployment pipelines and monitoring systems
- Observability & Performance: Master advanced monitoring, tracing, and performance optimization techniques
- Automation & Intelligence: Build smart alerting systems and automated remediation workflows
- Distributed Systems: Design and maintain globally distributed payment processing systems
What Makes You Thrive:
- You're energized by solving complex problems
- You believe in automation over manual processes
- You enjoy mentoring others and sharing knowledge
- You're comfortable with ambiguity and rapid change
- You value building reliable systems over quick fixes
Additional Information:
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
Remote Work:
No
Employment Type:
Full-time