Networking Reliability Developer 3

Oracle

Job Location:

Singapore - Singapore

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Description

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence.

As a Senior Network Reliability Engineer on the OCI Network Availability team, you will play a crucial role in ensuring the high availability and performance of Oracle Cloud's global network infrastructure. This role involves applying engineering methodologies to measure, monitor, and automate the reliability of OCI's network, supporting millions of users across a vast distributed environment.

You will be part of a fast-paced, innovative team responsible for swiftly responding to network disruptions, identifying root causes, and collaborating with both internal and external stakeholders to restore services. Your work will also focus on automating daily operations, improving workflow efficiency, and optimizing network performance. With OCI's expansive global footprint, you will manage hundreds of thousands of network devices across a mix of dedicated backbone infrastructure, Clos networks, and the internet.



Responsibilities

Support and Operate OCI's Global Network: Design, deploy, and manage large-scale network solutions that power Oracle Cloud Infrastructure (OCI), ensuring reliability and performance at a global scale.

Collaborate and Drive Change: Use best practices and tools to develop and execute network changes safely. Work closely with cross-functional teams to continuously improve network performance.

Incident Response and Troubleshooting: Lead break-fix support for network events, provide escalation for complex issues, and perform post-event root cause analysis to prevent future disruptions.

Automation and Efficiency: Create and maintain scripts to automate routine network tasks, working with business units and teams to streamline operations and increase productivity.

Mentorship and Knowledge Sharing: Guide and mentor junior engineers, fostering a culture of collaboration, continuous learning, and technical excellence.

Network Monitoring and Performance Analysis: Collaborate with network monitoring teams to gather telemetry data, build dashboards, and set up alert rules to track network health and performance.

Vendor Collaboration: Work with network vendors and technical account teams to resolve network issues, qualify new firmware and operating systems, and ensure the network ecosystem's stability.

On-Call Support: Participate in the on-call rotation to provide after-hours support for critical network events, ensuring that operational excellence is maintained 24/7.

Experience:

Experience working in a large-scale ISP or cloud provider environment, supporting global network infrastructure.

Prior experience in a network operations role with a proven track record of handling complex network events.

Technical Skills:

Strong proficiency in network protocols and services, including MPLS, BGP, OSPF, IS-IS, TCP/IP, IPv4/IPv6, DNS, DHCP, VXLAN, and EVPN.

Extensive experience with network automation, scripting, and data center design. Python is preferred, though expertise in other scripting or compiled languages is a plus.

Hands-on experience with network monitoring and telemetry solutions, with the ability to leverage these tools to drive improvements in network reliability.

Familiarity with network modeling and programming, including YANG, OpenConfig, and NETCONF.

Problem-Solving and Collaboration:

Ability to apply engineering principles to resolve complex network issues, collaborating across teams to deliver effective solutions.

Strong communication skills, both written and verbal, with the ability to present technical information clearly to both technical and non-technical stakeholders.

Demonstrated experience in influencing product roadmap decisions, priorities, and feature development through sound judgment and technical expertise.



Qualifications

Career Level - IC3



Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...