TekWissen is a global workforce management provider operating throughout India and many other countries. The client below is a global company with shared ideals and a deep sense of family. From our earliest days as a pioneer of modern transportation, we have sought to make the world a better place, one that benefits lives, communities, and the planet.
Job Title: Site Reliability Engineering Engineer III
Location: Chennai
Work Type: Hybrid
Position Description:
- Employees in this job function are responsible for ensuring the availability, reliability, and performance of cloud and network systems and services by automating routine manual tasks.
Key Responsibilities:
- Write, configure, and deploy code that improves service reliability for existing or new systems; set the standard for others with respect to code quality.
- Provide helpful and actionable feedback and review for code or production changes.
- Drive repair/optimization of complex systems with consideration towards a wide range of contributing factors.
- Lead debugging, troubleshooting, and analysis of service architecture and design.
- Participate in on-call rotation.
- Write documentation: design, system analysis, runbooks, and playbooks. Provide design feedback and uplevel the design skills of others.
- Implement and manage SRE monitoring application backends using Golang, Postgres, and OpenTelemetry.
- Develop tooling using Terraform and other IaC tools to ensure visibility and proactive issue detection across our platforms.
- Work within GCP infrastructure, optimizing performance and cost, and scaling resources to meet demand.
- Collaborate with development teams to enhance system reliability and performance, applying a platform engineering mindset to system administration tasks.
- Develop and maintain automated solutions for operational aspects such as on-call, monitoring, performance tuning, and disaster recovery.
- Troubleshoot and resolve issues in our dev, test, and production environments.
- Participate in postmortem analysis and create preventative measures for future incidents.
- Implement and maintain security best practices across our infrastructure, ensuring compliance with industry standards and internal policies.
- Participate in security audits and vulnerability assessments.
- Identify and address performance bottlenecks through code profiling, system analysis, and configuration tuning. Implement and monitor performance metrics to proactively identify and resolve issues.
- Contribute to internal knowledge bases and documentation.
Skills Required:
Experience Required:
- Engineer III experience: two coding languages, or advanced proficiency in one language; 6 years in IT; 4 years in development.
Education Required:
TekWissen Group is an equal opportunity employer supporting workforce diversity.