Principal Site Reliability Developer

Oracle

Posted on: 22-08-2025

1 Vacancy

Job Location

Zapopan - Mexico

Monthly Salary

Not Disclosed

Job Description

Description

As a senior member of the Site Reliability Engineering (SRE) team, you'll take ownership of highly available systems, influence service design, and work across teams to drive resiliency, automation, and operational excellence. This is a hands-on engineering role where deep infrastructure knowledge meets software engineering expertise, ideal for experienced SREs ready to take the lead.

Responsibilities

What You'll Do:

Lead the design, automation, and support of OCI services, with a focus on resiliency, security, scalability, and performance.
Own and improve the end-to-end reliability metrics (SLOs, SLAs, KPIs) for your services.
Design and implement high-availability architectures and standards for large-scale distributed systems.
Serve as the ultimate escalation point for complex operational issues using a deep understanding of service topologies and interdependencies.
Architect and build automation and orchestration tools that reduce manual work and prevent problem recurrence.
Collaborate with development teams to improve service designs, optimize deployments, and implement best practices for operational efficiency.
Guide technical decision-making and mentor junior SREs and developers across teams.
Participate in and lead postmortems, root cause analysis, and preventative design changes.
Contribute to capacity planning, demand forecasting, and long-term service scalability strategies.
Participate in a rotational on-call schedule to ensure the health and availability of production services.

What We're Looking For:

Advanced experience with Linux systems administration
Strong programming skills in Python (with automation libraries)
Advanced Bash/Shell scripting
Deep understanding of distributed systems, networking, and service architecture
Solid knowledge of databases and how they behave in production (SQL or NoSQL)
Strong understanding of CI/CD pipelines, Agile methodologies, and DevOps best practices
Experience writing and maintaining unit tests and production-grade software
Proven ability to lead cross-functional efforts and technical problem-solving in live environments

Nice to Have:

Hands-on experience with monitoring and observability tools (Grafana, Prometheus, New Relic, etc.)
Familiarity with Oracle Cloud Infrastructure (OCI) or other cloud platforms (AWS, Azure, GCP)
Experience with Infrastructure-as-Code (Terraform, Ansible) and container orchestration (Kubernetes)

Qualifications

Career Level - IC4

Required Experience:

Staff IC

Employment Type

Full-Time
