1 - 6 years
Anywhere (Remote)
Salary Not Disclosed
Any Nationality
1 Vacancy
Job Responsibilities:
• Build innovative monitoring, metrics, and log analytics to enable continuous improvement in system visibility and application performance
• Identify the root causes of critical problems across the platform, write incident reports, and communicate findings
• Sustain, monitor, and help improve the performance and availability of the 24x7 production environment, including networks, servers, and databases
• Support and troubleshoot the production and QA environments
• Adhere to established internal workflows when resolving customer requests/tickets
• Participate in an on-call rotation and offer hands-on assistance during emergencies, outages, and service transitions
• Contribute to the development of long-term and short-term strategies for scaling the production environment
• Follow a comprehensive incident management program that includes problem resolution
• Create key performance indicators (KPIs) for service availability, uptime, and adherence to SOPs and SLAs

Job Requirements:
• Bachelor’s/Master’s degree in Engineering or Computer Science (or equivalent experience)
• 3+ years of relevant experience as a software engineer
• 2+ years of experience with Linux server administration
• Experience with Amazon Web Services (EC2, ECS, S3, etc.)
• Basic scripting skills in Bash and Python
• Technical knowledge of several monitoring and analytics tools such as AWS CloudWatch, Sumo Logic, and Datadog
• Experience with Docker, Vagrant, Ansible, Chef, and Puppet is nice to have
• Experience in maintaining a secure production environment
• Proficiency in English
Remote
Accounting & Auditing
Software Development / Application Development (IT Software)