In this strategic role, you'll develop frameworks and systems to monitor network performance, reliability, and security in environments spanning AWS, Azure, GCP, and on-premises systems. Your mission: ensure deep, real-time insight into traffic flows, latency, connectivity, and cloud service interactions to enable fast detection, diagnosis, and resolution of network issues. You will partner with engineering, security, and operations teams to implement monitoring for VPCs, VPNs, interconnects, load balancers, service meshes, DNS, and firewall traffic across cloud providers. You will also define telemetry standards and observability KPIs for cloud networking: latency, jitter, packet loss, throughput, and service reachability.
Expertise in designing and operationalizing large-scale, distributed, fault-tolerant observability platforms
Strong understanding of cloud-native networking components: VPCs, peering, transit gateways, service meshes, NATs, firewalls, etc.
Deep knowledge of API design and interface technologies (JSON, ProtoBuf, REST, RPC, XML, etc.)
Strong systems programming skills, including multi-threading, concurrency, caching, and batching
In-depth knowledge of K8s, OpenStack, system virtualization, build systems, and infrastructure as code
10 years of proven experience designing and operationalizing large-scale, distributed, fault-tolerant observability platforms