Site Reliability Engineer I

Tekion

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 7 hours ago
Vacancies: 1 Vacancy

Job Summary

About Tekion:

Positively disrupting an industry that has not seen any innovation in over 50 years, Tekion has challenged the paradigm with the first and fastest cloud-native automotive platform that includes the revolutionary Automotive Retail Cloud (ARC) for retailers, Automotive Enterprise Cloud (AEC) for manufacturers and other large automotive enterprises, and Automotive Partner Cloud (APC) for technology and industry partners. Tekion connects the entire spectrum of the automotive retail ecosystem through one seamless platform. The transformative platform uses cutting-edge technology, big data, machine learning, and AI to seamlessly bring together OEMs, retailers/dealers, and consumers. With its highly configurable integration and greater customer engagement capabilities, Tekion is enabling the best automotive retail experiences ever. Tekion employs close to 3000 people across North America, Asia, and Europe.

About the Role:

We are looking for an energetic SRE 1 to join our Infrastructure team! We are on a mission to build a world-class observability platform. If you love digging into data, automating boring tasks, and making systems talk to us through metrics and traces, we want to meet you. You won't just be watching screens; you will be building the tools that keep our platform reliable and helping our developers sleep better at night.

What You'll Be Doing:

Driving OpenTelemetry Adoption: You will help us migrate from legacy agents (like Filebeat) to the OpenTelemetry (OTel) Collector. You'll tackle cool challenges like standardizing log formats and implementing tail-based sampling.

Kubernetes Automation: Work on creating standardized base images with pre-configured OTel agents to make life easier for our developers.

Taming the Data: Help us solve high-cardinality issues. You'll refine our metric ingestion strategies to ensure we are storing useful data without exploding our storage costs.

Tooling & Dashboards: Build actionable dashboards and alerts in New Relic, Observe, and Grafana. You'll help ensure our alerts actually mean something (reducing noise!).

Incident Response: Participate in the on-call rotation, using tools like PagerDuty to detect and resolve issues fast.

Coding & Scripting: Use Python, Go, or Bash to automate manual operational tasks.

What We Are Looking For:

Experience: 1-3 years of experience in SRE, DevOps, or Software Engineering.

Kubernetes Pro: You know your way around K8s clusters, pods, and deployments.

Observability Obsessed: Experience with at least one major tool (New Relic, Prometheus/Grafana, ELK, or Datadog).

Cloud Native: Familiarity with public clouds (AWS, GCP, or Azure).

Coding Skills: Comfortable writing scripts in Python, Go, or Java.

Curious Mindset: You ask why when a system breaks and how we can prevent it next time.

Bonus Points (Nice to Have):

Experience specifically with OpenTelemetry (OTel) instrumentation.

Knowledge of Terraform or Ansible.

Understanding of SLOs/SLIs and Golden Signals.

Experience optimizing observability costs (ingestion volume, data retention).

Tekion is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, victim of violence or having a family member who is a victim of violence, the intersectionality of two or more protected categories, or other applicable legally protected characteristics.

For more information on our privacy practices, please refer to our Applicant Privacy Notice.


Required Experience:

IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

One platform that seamlessly connects your entire automotive retail business. Unify DMS, CRM, Digital Retail, Analytics, and more.
