As a Senior NOC Engineer, you will play a vital role in ensuring the health, stability, and uptime of our production systems. This is a hands-on operational role requiring a deep understanding of system administration, networking, and incident response. You'll act as the first line of defense during outages and performance issues, with responsibility for real-time monitoring, troubleshooting, and driving incident resolution in a 24/7 environment. If you enjoy working with infrastructure at scale and thrive in fast-paced environments, this is the role for you.
Roles & Responsibilities
- Monitor production systems and applications to ensure consistent uptime, performance, and availability
- Respond to and manage incidents, alerts, and outages in real time, coordinating appropriate responses
- Conduct root cause analysis (RCA) and implement corrective and preventive actions
- Troubleshoot system, application, and network issues escalated by monitoring systems or support teams
- Participate in 24/7 shift rotations, including weekends and holidays, to ensure continuous support
- Collaborate with engineering and product teams to improve observability and monitoring frameworks
- Develop and update SOPs, runbooks, and internal knowledge bases to ensure process consistency
- Maintain compliance with internal security, audit, and operational standards
- Recommend and implement automation and monitoring improvements to increase efficiency and reduce incident frequency
- Engage in post-incident reviews and help drive blameless postmortems and process improvement initiatives
Qualifications :
- 3 years of hands-on experience in Linux/Unix systems administration and network troubleshooting
- Solid grasp of internet and network protocols: DNS, DHCP, TCP/IP, NTP, SMTP, VPNs, HTTPS, TLS, and IPsec
- Experience monitoring and managing applications like Apache, Tomcat, and MySQL
- Proficient in scripting using Shell, Python, or Ruby for automation
- Experience with monitoring/logging tools such as Nagios, Datadog, New Relic, ELK, Splunk, or Sumo Logic
- Familiarity with incident management platforms like PagerDuty, JIRA, or ServiceNow
- Basic knowledge of web technologies, including HTML, CSS, JavaScript, and backend fundamentals
- Experience with public cloud platforms (preferably AWS)
- Hands-on experience with Docker and Kubernetes
- Working knowledge of CI/CD pipelines and tools like Jenkins
- Familiarity with Infrastructure-as-Code using Terraform
- Excellent communication skills and ability to work with cross-functional teams, including DevOps, SRE, and Security
Skills Inventory
- Production Monitoring: Real-time infrastructure and application monitoring for uptime and performance
- Incident Response: Timely identification, escalation, and resolution of production issues
- Root Cause Analysis: Investigation and documentation of service-impacting events
- Linux/Unix Administration: Deep expertise in managing server environments
- Networking Fundamentals: Strong understanding of protocols like DNS, DHCP, TCP/IP, and VPN
- Scripting & Automation: Writing scripts in Shell/Python/Ruby to automate tasks
- Monitoring & Logging Tools: Hands-on use of tools like Datadog, ELK, Nagios, and Splunk
- Cloud Infrastructure: Working with AWS or equivalent public cloud platforms
- Containers & Orchestration: Knowledge of Docker and Kubernetes
- CI/CD & DevOps: Familiarity with Jenkins and deployment pipelines
- Infrastructure as Code: Basic experience using Terraform
- Collaboration: Strong coordination with SRE, Security, and Engineering teams
- Compliance & Documentation: Creating SOPs and playbooks, and ensuring adherence to policies
Additional Information :
At Freshworks, we are creating a global workplace that enables everyone to find their true potential, purpose, and passion, irrespective of their background, gender, race, sexual orientation, religion, and ethnicity. We are committed to providing equal opportunity for all and believe that diversity in the workplace creates a more vibrant, richer work environment that advances the goals of our employees, communities, and the business.
Remote Work :
No
Employment Type :
Full-time