JOB SUMMARY
The Senior Network Observability Engineer, Network Reliability Engineering (NRE), is the subject matter expert in designing and implementing the Network Observability strategy and platforms for next-gen operations and engineering across all Marriott International (MI) networks, including Property Networks, Datacenter, Corporate and Client Networks, and multi-cloud environments, transforming them into a proactive, telemetry-driven ecosystem. This role will work closely with a matrix team of expert network architects and engineers to drive adoption of SRE practices and operating models across all network product towers and around the globe.
As the Senior Network Observability Engineer, the candidate will be a technical authority in network observability, capable of architecting solutions that leverage AI/ML-driven insights, real-time telemetry, and automation frameworks to predict, prevent, and resolve network issues before they impact business operations. This position requires deep expertise in NetScout, NetBrain, BigPanda, ThousandEyes, and advanced network observability platforms, as well as the ability to integrate these tools into Marriott's SRE operating model.
The engineer will collaborate with network architects, product owners, and global operations teams to define and enforce observability standards, build automation pipelines, and deliver actionable intelligence across thousands of properties worldwide. This role demands a visionary mindset to overcome the limitations of traditional monitoring and implement granular instrumentation, distributed tracing, and anomaly detection at scale.
CANDIDATE PROFILE
Required Education and Experience
Undergraduate degree in computer science, network engineering, or related discipline, and/or equivalent experience/certification
7 years of progressive experience in network observability, telemetry engineering, and performance optimization for large-scale, mission-critical environments
Proven expertise in collecting, processing, and correlating telemetry data (NetFlow, IPFIX, SNMP, streaming telemetry) to enable predictive analytics and proactive incident prevention.
Hands-on experience with enterprise-grade observability, SaaS, and security platforms, including NetScout, NetBrain, ThousandEyes, BigPanda, and other AI/ML-driven monitoring solutions.
Demonstrated ability to install, configure, and optimize observability tools, integrate APIs, and build automation workflows for anomaly detection and remediation.
Strong proficiency in administration of network tools and policy enforcement, including role-based access control and compliance frameworks.
Expertise in developing observability requirements, architecture designs, and implementation roadmaps, ensuring alignment with SRE principles and Agile delivery models.
Deep understanding of foundational networking protocols and technologies (ARP, TCP/IP, UDP, DHCP, DNS, NAT) and advanced routing protocols (OSPF, BGP).
Hands-on experience with Palo Alto Prisma, SD-WAN, and Strata Cloud Manager, including routing and switching platforms (Cisco, Juniper, HP/Aruba).
Demonstrated experience in delivering written documents, including detailed network solutions and architecture diagrams.
Experience with one or more cloud computing platforms (Amazon AWS, Microsoft Azure, Google Cloud Platform).
Experience in Agile and DevOps practices, including sprint planning, backlog grooming, and embedding observability into CI/CD pipelines.
Ability to design custom dashboards, KPIs, and alerting strategies for real-time visibility and executive reporting.
Preferred:
Advanced degree (e.g., MS, PhD) in Computer Science or other technical discipline, or MBA, preferably with a focus on technology
Advanced certifications (CCNP, AWS and Azure Networking Specialty) strongly preferred.
Experience managing network observability tools in hospitality or global enterprise environments.
Proficiency in leveraging public APIs for automation and integration with observability platforms.
Strong ability to collaborate across cross-functional teams in multiple time zones, driving alignment and execution.
Demonstrated experience in researching emerging technologies, standards, and trends, and translating them into actionable roadmaps.
Deep knowledge of next-generation observability tools and frameworks, including NetScout, NetBrain, ThousandEyes, and AIOps platforms.
Proven ability to design and implement automation for network instrumentation and monitoring using scripting languages and interfaces (Python, REST APIs).
Excellent problem-solving skills, capable of working independently and leading outcomes for distributed teams.
Strong understanding of change management, testing methodologies, and high-availability strategies for critical platforms.
Ability to manage multiple priorities effectively with exceptional attention to detail.
Track record of driving transformation in network technologies and observability practices through data-driven continuous improvement.
Experience improving reliability, performance, and agility of complex enterprise networks.
Expertise in network infrastructure automation, instrumentation, and emerging observability technologies.
Strong influencing and leadership skills to overcome barriers and drive organizational change.
Exceptional verbal and written communication skills, including executive-level presentations and technical documentation.
CORE WORK ACTIVITIES
Develop complex, global, distributed infrastructure monitoring, management, and automation solutions to manage our global network.
Lead the design, writing, and building of tools to improve the reliability, availability, and scalability of Datacenter/Cloud Networks, Property Networks, and Corporate Networks
Serve as technical lead for the development of complex, global, distributed infrastructure monitoring, management, and automation solutions to manage our global network.
Serve as technical lead for the design of new monitoring tools and smart alerts that help discover failures or issues before our customers do.
Collaborate with other Network teams to develop network SRE solutions with a focus on production integration
Conduct network analysis and configuration management, and develop improvements for system software performance, availability, and reliability
Provide program management assistance and contribute input to help manage project schedules, risks, and costs.
Manage Network SRE products and solutions, including the design, low-level engineering, and delivery of new hardware systems for Marriott applications across the network.
Define and implement an operational Recovery Time Objective (RTO) and Recovery Point Objective (RPO) strategy for all Network Infrastructure areas.
Establish management-level relationships and partner with all business disciplines and other MI teams to define Network SRE services, meet service level requirements, and serve as an escalation point to resolve service delivery and operational issues.
Develop, document, and manage the requirements gathering process, and provide detailed design and business processes to support the requirements throughout the project life cycle
Drive accountability with strategic sourcing partners, vendors, telco/ISPs, etc., launching and managing Performance Improvement initiatives where appropriate.
Create functional strategies and specific objectives for the sub-function, and contribute to the development of budgets, policies, and procedures to support the functional Network SRE tools, systems, and infrastructure.
Perform network troubleshooting and upgrades. Coordinate with local teams and vendors to solve problems and restore services as needed
Foster an environment of continuous improvement and structured processes and procedures that support a zero-fault culture.
Maintaining Goals
Submits reports in a timely manner ensuring delivery deadlines are met.
Promotes the documenting of project progress accurately.
Provides input and assistance to other teams regarding projects.
Demonstrating and Applying Discipline Knowledge
Provides technical expertise and support to persons inside and outside of the department.
Demonstrates knowledge of job-relevant issues, products, systems, and processes.
Demonstrates knowledge of function-specific procedures.
Keeps up-to-date technically and applies new knowledge to job.
Uses computers and computer systems (including hardware and software) to enter data and/or process information.
Delivering on the Needs of Key Stakeholders
Understands and meets the needs of key stakeholders.
Develops specific goals and plans to prioritize, organize, and accomplish work.
Determines priorities, schedules, plans, and necessary resources to ensure completion of any projects on schedule.
Collaborates with internal partners and stakeholders to support business/initiative strategies
Communicates concepts in a clear and persuasive manner that is easy to understand.
Generates and provides accurate and timely results in the form of reports, presentations, etc.
Demonstrates an understanding of business priorities
At Marriott International, we are dedicated to being an equal opportunity employer, welcoming all and providing access to opportunity. We actively foster an environment where the unique backgrounds of our associates are valued and celebrated. Our greatest strength lies in the rich blend of culture, talent, and experiences of our associates. We are committed to non-discrimination on any protected basis, including disability, veteran status, or other basis protected by applicable law.
Required Experience:
Senior IC