SRE - Database Management AI

Oracle

Job Location: Bengaluru, India

Monthly Salary: Not Disclosed
Posted on: 23 hours ago
Vacancies: 1

Job Summary

Description

Key Responsibilities

  • Operate and optimize Oracle Database and Exadata environments to meet stringent availability, performance, and scalability targets in 24x7 production.

  • Lead database reliability engineering initiatives, including HA design patterns, capacity planning, demand forecasting, and performance analysis/system tuning.

  • Deliver advanced performance tuning (SQL optimization, indexing strategies, configuration and storage tuning) and drive measurable improvements in latency, throughput, and stability.

  • Design and maintain backup, recovery, and disaster recovery strategies; validate restore procedures and ensure readiness for mission-critical environments.

  • Apply SRE best practices, including defining SLIs/SLOs, managing error budgets, and improving incident response through post-incident reviews and durable corrective actions.

  • Build automation and tools (Python/Shell/PowerShell) to eliminate toil, reduce MTTR, improve deployment reliability, and prevent recurring incidents.

  • Instrument and enhance observability using monitoring/APM stacks (e.g., Prometheus, Grafana, APM) to improve signal quality and reduce alert noise.

  • Partner with engineering and architecture teams on service and database design, data modeling decisions, and system architecture improvements for distributed systems.



Qualifications & Skills

Mandatory

  • Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).

  • Experience: 6 years in SRE, Cloud Engineering, DevOps, Database Reliability, or similar production-operations engineering roles.

  • Oracle Database expertise: Expert hands-on experience with Oracle Database and Exadata administration, high availability architectures, and production operations.

  • Performance tuning: Demonstrated capability in SQL tuning, indexing strategies, resource utilization analysis, and system tuning for high-scale workloads.

  • Backup/DR: Proven experience designing and operating backup, recovery, and disaster recovery solutions for 24x7 mission-critical systems.

  • Automation/scripting: Strong hands-on proficiency in Python and/or Shell/PowerShell for automation, tooling, and operational workflows.

  • Reliability & distributed systems: Solid understanding of cloud concepts, distributed systems behaviors, and SRE fundamentals (SLIs/SLOs, incident response, RCA).

  • Operational excellence: Strong troubleshooting, analytical thinking, and clear communication skills; comfortable acting as an escalation point during critical incidents.

Good-to-Have

  • Cloud platforms: OCI preferred; AWS/Azure/GCP experience also valuable.

  • IaC & configuration management: Terraform, Ansible, and Infrastructure-as-Code best practices.

  • Containers: Kubernetes and Docker exposure in production environments.

  • Observability depth: Experience with database observability, APM tooling, tracing, and alert quality/noise reduction initiatives.

  • AI familiarity: Exposure to LLMs, RAG, or AI agents (especially in operational tooling/automation contexts).

  • Certifications: Oracle Database/Exadata, OCI (or other cloud architect), or SRE/DevOps-related certifications.

Self-Assessment Questions

  1. Have I owned production Oracle Database/Exadata environments and successfully improved availability or performance through concrete tuning or architecture changes?

  2. Can I confidently diagnose performance issues end-to-end (SQL, indexing, configuration, storage, and workload characteristics) and explain tradeoffs to stakeholders?

  3. Have I designed and validated backup/restore and DR processes (including regular testing) for systems that require 24x7 reliability?

  4. Do I routinely build automation in Python/Shell/PowerShell to reduce manual operational work, improve MTTR, or prevent recurring incidents?

  5. Am I comfortable applying SRE practices (SLIs/SLOs, error budgets, incident response, RCA/postmortems) and driving improvements across teams?



Qualifications

Career Level - IC3



Key Skills

  • Abinitio
  • Lifting Equipment
  • Customer Service
  • Apache Commons
  • Business Management

About Company

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...
