Description
Oracle is looking for an experienced Senior Cloud Engineer to join our Cloud Engineering team in Brazil. This is an opportunity to be part of a high-impact team that drives reliability, scalability, and innovation in mission-critical environments.
As a Senior Cloud Engineer, you will be responsible for designing and implementing modern cloud infrastructure solutions with a strong focus on observability, automation, scalability, and performance. You will work collaboratively with SREs, architects, and product teams to support large-scale environments with cutting-edge technologies, including Kubernetes, Infrastructure as Code (IaC), and High Performance Computing (HPC).
This role requires deep expertise in Site Reliability Engineering, cloud-native environments, and modern DevOps practices. Experience with GPUs and AIOps solutions will be considered a strong plus.
Responsibilities
- Design, implement, and support scalable and resilient cloud infrastructure on OCI and/or other cloud platforms
- Automate infrastructure provisioning and deployment using Infrastructure as Code (Terraform, Ansible, etc.)
- Collaborate with development and operations teams to ensure system reliability, scalability, and performance
- Manage Kubernetes clusters and containerized workloads in production environments
- Lead root cause analysis and resolution of service incidents and performance issues
- Contribute to SRE practices: monitoring, incident management, chaos engineering, and SLAs/SLOs
- Support HPC workloads and GPU-powered infrastructure as needed
- Drive innovation through continuous improvement, automation, and implementation of AIOps strategies
- Document solutions and processes, mentor junior engineers, and lead technical discussions
Required Qualifications
- Hands-on experience as a Site Reliability Engineer (SRE) or Cloud Engineer in production environments
- Proficiency in container orchestration with Kubernetes
- Strong experience with cloud infrastructure (OCI, AWS, Azure, GCP)
- Experience with Infrastructure as Code tools (Terraform, Ansible, etc.)
- Familiarity with HPC architectures and Linux performance tuning
- Knowledge of monitoring, observability, and incident management best practices
- Solid scripting skills (Shell, Python, or similar)
- Strong analytical, problem-solving, and communication skills
- Ability to work in a fast-paced, collaborative, and globally distributed team environment
Preferred Qualifications
- Experience with GPUs for compute-intensive workloads
- Familiarity with AIOps platforms and practices
- OCI certifications (or equivalent cloud certifications)
- Experience with CI/CD pipelines and GitOps workflows
- Bachelor's or Master's degree in Computer Science, Engineering, or related field
Qualifications
Career Level - IC4