Lead Database Reliability Engineer
Job Summary
About the Role:
The Database Reliability Engineer (DBRE) role is an extension, or subset, of the SRE (Site
Reliability Engineering) model that specializes in database technologies while following the
same underlying DevOps principles. The DBRE will be a lead strategic partner in building and
maintaining a Database as a Service platform that helps software engineers build, deploy, and
monitor applications, with an emphasis on automation. This is an engineering discipline that
combines software and systems engineering to build and run large-scale, massively distributed,
fault-tolerant systems.
The DBRE is responsible for the availability and reliability of our most critical database
platform services and ensures they meet the requirements of our internal and external users.
The hosting platforms will be on-prem servers as well as public clouds such as AWS and Azure.
How you will make an impact:
Drive technology initiatives by taking the lead and providing guidance to team members.
Design, build, and maintain enterprise-scale production relational backends using Microsoft SQL Server, MySQL, or Oracle (both on-premises and in the cloud, with a particular emphasis on Relational Database Service in Amazon Web Services).
Be involved in designing, building, maintaining, and monitoring CI/CD pipelines and all deployments up to production.
Handle performance tuning, backup, and recovery tasks.
Create automated processes for recurring database tasks and deployments (such as migrations, replication, restoring backups, and spinning up new clusters).
Develop and automate best practices and repeatable procedures for deploying and
scaling databases.
Provide production and lower-environment support for the back-end databases of
assigned applications.
Build and maintain High Availability (HA) and Disaster Recovery (DR) designs and implementations for complex, mission-critical environments.
Assist with the design and implementation of infrastructure assets using cloud services.
Identify improvement opportunities in existing systems, build plans, and execute improvements.
Research automation-related technologies.
Diagnose and troubleshoot database errors, including participating in an on-call rotation
and being available for on-call support as needed (including weekends when
required).
We are looking for people who:
Have 5 years of experience in either PowerShell/Windows command-line scripting or Linux scripting such as Bash, especially in troubleshooting production systems.
Have 5 years of experience in building, configuring, and managing database environments.
Have experience with at least two relational and non-relational databases, such as Microsoft
SQL Server, MySQL, Oracle, PostgreSQL, MongoDB, and CouchDB.
Have experience in analyzing requirements and proposing database solutions.
Have hands-on experience in building, managing, and troubleshooting high-availability
features such as clustering, log shipping, and mirroring.
Have 2-4 years of experience using cloud database services such as Amazon RDS.
Have experience in DevOps configuration management and system automation using tools
such as Terraform, Ansible, CloudFormation, and Chef.
Have hands-on experience with Continuous Integration/Continuous Delivery and Deployment techniques and tools such as Jenkins and GitHub.
Have exposure to containerization (Docker) and a container orchestration system
(ECS/Kubernetes).
Have a good understanding of disciplines related to database reliability engineering, such
as systems management, security, and release management.
Have experience in managing projects and initiatives with minimal supervision.
Have effective communication skills, both verbal and written.
Can document the processes and procedures involved.
Required Experience:
IC
About Company
At Virtusa, we are builders, makers, and doers. Digital engineering is in our DNA. It’s at the heart of everything we do.