Description:
We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You will work closely with engineering, product, and operations teams to design and implement robust automation and monitoring solutions, and lead efforts to improve system reliability and efficiency.
Responsibilities:
- Work with the Site Reliability Engineering (SRE) team to build and maintain OCI compute cloud infrastructure and services across multiple geographic regions.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of Oracle Cloud Region Build services.
- Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Take part in incident response to help remove blockers during the region build process.
- Continuously improve the compute cloud infrastructure region build process.
- Participate in on-call rotations and provide support for critical infrastructure issues.
- Automate infrastructure provisioning, configuration, and deployment using tools like Terraform.
- Collaborate with cross-functional teams to design and roll out new cloud region builds and expansions.
- Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations.
- Collaborate with software engineers to build scalable, reliable, and highly available cloud-native systems.
- Monitor system health and performance using tools like Grafana.
- Document operational procedures and runbooks.
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent experience).
- Proven experience (3 years) as an SRE, Cloud Engineer, or DevOps Engineer in cloud environments.
- Strong knowledge of cloud platforms such as AWS, GCP, or Azure, with hands-on experience in building and managing regional deployments.
- Expertise in Infrastructure as Code (Terraform, CloudFormation, Ansible, etc.).
- Proficient with scripting languages (Python, Bash, Go, etc.).
- Experience with monitoring, alerting, and logging tools (Prometheus, Grafana, ELK stack, Datadog, etc.).
- Solid understanding of networking, security, and distributed systems in cloud environments.
- Experience working in Agile teams and collaborating with software engineers and product teams.
- Strong troubleshooting and problem-solving skills.
- Excellent communication and documentation skills.
Qualifications:
Career Level - IC3