Join Oracle Cloud Infrastructure (OCI) as a Principal Member of Technical Staff and play a pivotal role in shaping the future of cloud. In this role, you will lead the design, development, and operation of compute operability solutions, ensuring the reliability, scalability, and performance of OCI's compute infrastructure. You'll work with a team of innovative engineers to build and operate massive-scale, integrated cloud services that power businesses and organizations worldwide.
As a key member of the OCI Compute team, you will focus on enhancing the operability of our compute services, driving automation, and optimizing system reliability. Your work will directly impact the performance of mission-critical workloads for Oracle's global customers, solving complex challenges in distributed systems, high-availability computing, and operational excellence.
Responsibilities
Design and implement scalable, reliable, and high-performance compute operability solutions for OCI.
Develop tools, frameworks, and automation to enhance the operational efficiency of compute infrastructure.
Collaborate with cross-functional teams to define and deliver operability improvements, including monitoring, incident response, and capacity planning.
Troubleshoot and resolve complex technical issues in large-scale distributed systems.
Drive the adoption of best practices for system reliability, performance tuning, and operational excellence.
Mentor junior engineers and contribute to the technical strategy for compute services.
Innovate to improve system availability, reduce latency, and optimize resource utilization.
Participate in on-call rotations to ensure 24/7 service reliability.
Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent experience.
7 years of experience in software engineering, with at least 3 years focused on cloud infrastructure or distributed systems.
Deep expertise in compute operability, including virtualization, containerization, or orchestration technologies (e.g., KVM, Docker, Kubernetes).
Strong programming skills in languages such as Go, Python, Java, or C.
Strong data analysis experience and proficiency in SQL.
Proven experience with large-scale system design, automation, and operational tools (e.g., Grafana, Terraform, Prometheus).
Familiarity with cloud computing concepts, including IaaS, PaaS, or serverless architectures.
Excellent problem-solving skills and a track record of resolving complex technical challenges.
Strong communication and collaboration skills to work effectively in a globally distributed team.
Preferred Qualifications
Experience with OCI, AWS, Azure, or Google Cloud Platform.
Contributions to open-source projects or a strong portfolio of technical innovation.
Experience with observability tools (e.g., Grafana, ELK stack) and incident management processes.
Background in building automation for zero-downtime deployments or self-healing systems.
Career Level: IC4
Required Experience: Staff IC
Employment Type: Full-Time