Site Reliability Engineer (SRE)

Job Location

Chennai - India

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Job Title: Site Reliability Engineer (SRE)
Experience: 6 to 9 years
Location: Chennai
Job Overview:
We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our growing team. As an SRE, you will be responsible for maintaining the reliability, availability, and performance of our systems. We're looking for someone with solid experience in monitoring, scripting, and dashboarding to ensure our services run smoothly and efficiently.
Key Responsibilities:
  • System Monitoring: Design, implement, and maintain monitoring systems to track the performance, availability, and reliability of applications and infrastructure.
  • Incident Management: Troubleshoot, resolve, and document incidents across various systems to ensure minimal downtime.
  • Automation & Scripting: Write and maintain scripts to automate routine tasks and improve operational efficiency. Experience with scripting languages like Python, Bash, or similar is essential.
  • Dashboarding: Develop and maintain dashboards to visualize key metrics and system health, enabling proactive identification of potential issues.
  • Collaboration: Work closely with development teams to design reliable, scalable systems that can handle production traffic.
  • On-call Support: Participate in on-call rotations to ensure 24/7 support for critical infrastructure and services.
  • Capacity Planning: Analyze system capacity and forecast future needs to ensure systems can scale effectively.
Skills & Qualifications:
  • Experience: 6-9 years of hands-on experience in an SRE, DevOps, or related role.
  • Scripting Knowledge: Strong proficiency in scripting languages (e.g., Python, Bash, or similar).
  • Monitoring Tools: In-depth experience with monitoring tools (e.g., Prometheus, Grafana, Nagios).
  • Dashboarding: Expertise in creating visualizations and dashboards that make system performance easy to monitor and understand.
  • Problem-Solving: Strong analytical skills with a demonstrated ability to troubleshoot and resolve complex issues.
  • Communication Skills: Excellent communication skills and the ability to work in a collaborative, fast-paced environment.
Preferred Qualifications:
  • Familiarity with cloud platforms (AWS, GCP, Azure).
  • Experience with containerization and orchestration tools like Docker and Kubernetes.
  • Knowledge of CI/CD pipelines and their implementation in an SRE environment.
  • Previous experience with high-traffic systems and the ability to design for scale.

CI/CD pipelines, automation & scripting, scripting languages (Python, Bash), on-call support, incident management, system monitoring, infrastructure, dashboarding, site reliability engineer (SRE), cloud platforms (AWS, GCP, Azure), collaboration, capacity planning, monitoring tools (Prometheus, Grafana, Nagios), containerization (Docker, Kubernetes)

Employment Type

Full Time
