Role Summary:
- The Sr. Site Reliability Engineer will manage multiple applications under the Data Marketing Platform (DME). These applications operate across diverse platforms and various backend databases, and run on a hybrid model of containers and VMs. The role is critical to ensuring the reliability, availability, and performance of these systems.
Key Responsibilities:
- Handle client requests and issue tickets. Coordinate major releases and hotfixes across Production and DR sites.
- Manage all types of security remediation and exception handling. Share knowledge and cross-train team members to build a scalable resource pool. Automate repetitive manual tasks to improve accuracy and efficiency.
- Identify application gaps and collaborate with Dev and Product teams for long-term fixes.
- Communicate proactively with business clients and partners regarding releases and incidents. Implement monitoring solutions for early detection and proactive mitigation.
- Continuously monitor applications and respond swiftly to minimize impact. Provide timely issue notifications and lead restoration efforts.
- Contribute to TLT-level initiatives and pursue continuous learning.
- Ensure high availability and SLA compliance across the platform.
- Support PRE turnover by validating and refining onboarding checklists.
- Reduce alert noise through deep application knowledge and incident analysis.
- Track application growth and manage environment scaling.
- Engage with business and clients to plan changes, releases, and onboarding.
- In an expanded role, act as Associate System Analyst and drive innovation, automation, and strategic decision-making.
- Some applications have batch jobs scheduled through either Windows Scheduler or Control-M; ensure that the cyclic and hourly jobs run as expected.
This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.
Qualifications:
Basic Qualifications:
- Bachelor's degree and 4 years of relevant work experience
- API & Web: Java, Spring Boot, AngularJS, Python, Bash, PowerShell
- Big Data: Kafka, Hadoop, Spark, HBase
- Databases & Caching: Redis, MySQL, MSSQL, DB2, Cassandra, PostgreSQL
- Messaging: IBM MQ, AMQ
- Containers & Cloud: Docker, Kubernetes, Containers
- DevOps: Jenkins (CI/CD), GitHub, Bitbucket
- AI Tools: GenAI, MS Copilot
- Batch Operation Tools: Windows Scheduler, Control-M
- Strong expertise in security remediation and exception handling.
- Demonstrated ability to automate and optimize operational tasks.
- Excellent communication and stakeholder engagement skills.
- Proficiency in monitoring and incident response.
- Familiarity with the full technology stack listed above.
Additional Information:
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
Remote Work:
No
Employment Type:
Full-time