Site Reliability Engineering (SRE)

Programmers.io

Job Location:

Marlborough, NH - USA

Monthly Salary: Not Disclosed
Posted on: 12 hours ago
Vacancies: 1 Vacancy

Job Summary

Key Responsibilities
OMS Reliability: Maintain, monitor, and improve the performance of Sterling OMS or similar order management platforms to meet strict SLAs.
Automation & Scripting: Develop scripts and automation tools to reduce operational toil, automate deployments, and streamline incident response.
Incident Management: Lead root cause analysis (RCA) and post-mortems for production incidents, applying fixes to prevent recurrence.
Monitoring & Observability: Implement proactive monitoring, logging, and tracing to gain insight into system health and user experience.
System Optimization: Conduct capacity planning and performance tuning so the system handles peak order volumes efficiently.
Collaboration: Work with development teams to ensure new features are reliable and deployable.
Required Skills & Qualifications
Technical Expertise: Strong experience with Order Management Systems (e.g., Sterling OMS).
Programming/Scripting: Proficiency in languages such as Java, Python, or shell scripting.
Infrastructure & Tools: Familiarity with Linux, cloud platforms (AWS/Azure/GCP), containerization (Kubernetes/Docker), and CI/CD tools.
Operations Mindset: Experience in IT operations supporting large-scale distributed software applications.
Communication: Strong ability to communicate with both technical and non-technical teams, including offshore coordination.

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting