The mission of our Network Reliability Engineering team is to provide exceptional network reliability and automation services that enable our customers to drive operational excellence in
OCI networks at scale. By focusing on both reactive and proactive functions, we aim to minimize downtime, resolve incidents quickly, and continuously enhance network performance through
automation, advanced monitoring, and a customer-centric approach.
What you will bring:
5 years of experience in networking technologies
Proficiency with Python or similar scripting language
Proficiency with network technologies and protocols (TCP/IP, BGP, OSPF, MPLS)
Experience developing network automation or device management solutions
Excellent communication and organizational skills, thriving in collaborative, agile teams
Ownership mindset: delivering results, embracing ambiguity, and driving continuous improvement
Experience working in a network support role
Preferred Qualifications:
Proficiency with other network technologies and protocols, including IS-IS, RSVP-TE, EVPN, VXLAN, DHCP, DNS, IPv4, and IPv6
Experience with network modeling and programming (YAML, YANG, OpenConfig, NETCONF)
Experience with network monitoring and telemetry solutions.
Experience with ticketing systems like Jira and version control systems like Git.
Knowledge of Scrum & Agile Methodologies
Supports the design, deployment, and operations of a large-scale global Oracle Cloud Infrastructure (OCI). Primarily focused on the development and support of network fabric and systems through a combination of a deep understanding of networking at the protocol level and programming skills. As OCI is a cloud-based network with a global footprint, this support will include hundreds of thousands of network devices supporting millions of servers connected over a mix of dedicated backbone infrastructure, Clos networks, and the Internet.
Participates in the network solution and architecture design process.
Participate in operational support rotations as primary and secondary. Provide break-fix support for events. Serve as the escalation point for event remediation. Lead post-event root cause analysis.
Join major event/incident calls and use technical and analytical skills to resolve network issues that impact Oracle customers/services.
Fault handling and escalation: identifying and responding to faults on OCI's systems and networks, collaborating closely with 3rd party suppliers, and handling escalation through to resolution.
Collaborate with program/project managers to develop milestones and deliverables.
Will primarily use existing procedures and tools to develop and safely execute network changes. However, may have to develop new procedures from time to time.
Develop solutions to enable front line support teams to act on network failure conditions.
Mentor junior engineers.
Coordinate with networking automation services for the development and integration of support tooling.
Coordinate with network monitoring to gather telemetry and create alert rules based on it.
Build dashboards to represent data at various network layers and device roles that help identify network issues and anomalies.
Frequently develops scripts to automate routine tasks for team and business units.
Serves as SME on software development projects for network automation and network monitoring.
Collaborate with network vendor technical account teams and the internal Quality Assurance team to drive bug resolution and assist in the qualification of new firmware and/or operating systems.
Career Level - IC3
Required Experience:
Senior IC
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...