Linux Server Manager
Job Summary
We are seeking a highly qualified Server Manager to join our Enhanced Operation Service (EOS) team, which is part of Enterprise Cloud Services Delivery. In this role, you will act as a trusted advisor responsible for safeguarding and optimizing end-to-end service delivery for strategic customers throughout their cloud transformation journey.
You will be part of a global project, integrating into a team that operates in a 24x7 environment for a Tech Mahindra client. This is a remote position with a minimum expected assignment of one year. The initial working hours will be Monday to Friday from 9:00 AM to 6:00 PM; however, the schedule may change throughout the project to meet operational demands.
Key Responsibilities:
Your role will involve a mixed workload with a strong focus on maintaining system stability and performance:
Incident and Problem Management: Actively participate in the resolution of critical incidents (Major Incidents), resolve service request failures, and conduct root cause analysis (RCA) for outages or performance issues.
Service and Change Requests: Execute complex service and change requests, managing extended downtime windows and long-running incidents.
Performance Optimization: Identify and lead proactive initiatives to improve system operation, stability, and standardization for customer environments.
Specialized Technical Support: Provide expert-level technical support in Linux and infrastructure, with strong troubleshooting capabilities for disk, server, and network connectivity issues.
Continuous Improvement: Optimize Standard Operating Procedures (SOPs) through automation and define corrective action plans to achieve established KPIs.
Orchestration and Collaboration: Coordinate work across multiple internal and external cloud service units to ensure seamless service delivery.
Requirements
Core Technical Requirements:
Linux Expertise (Mandatory): Solid hands-on experience administering SUSE, Red Hat, or Ubuntu systems. Proven real-world experience in system, disk, and performance troubleshooting is essential.
Clustering & High Availability (Mandatory, Flexible): 2 to 3 years of experience with High Availability (HA) configurations. Knowledge of Pacemaker is preferred, but other clustering solutions (such as Red Hat HA) are also acceptable.
Cloud Proficiency (Mandatory): Practical experience with at least one major public cloud provider: AWS, Azure, or GCP. Experience in multi-cloud environments is considered a plus.
Networking: Advanced ability to diagnose network issues in Linux and cloud environments, including TCP/IP, DNS, LDAP, NAT, firewalls, and connectivity analysis.
Automation: Experience with scripting languages (Shell, Python, Go, etc.) and server automation tools such as Ansible or Chef.
Qualifications and Experience:
Professional Experience: Minimum of 8 to 10 years of experience in IT infrastructure and server operations.
Education: Bachelor's degree in Computer Science, Engineering, IT Management, or related fields.
Language: Fluent English is mandatory, as all interactions with the global team, documentation, and customer support will be conducted in English.
Soft Skills: Strong customer focus, an analytical and solution-oriented mindset, and the ability to independently and proactively acquire new knowledge.
Required Experience:
Manager