Role | SRE/OPS |
Location | - Chennai
- Flexibility to travel for business trips
|
Experience: | - 3 to 6 years of experience
|
What awaits you/ Job Profile
| Our vision is to provide an overarching platform for AI-based quality assurance (AIQX) in the global production network to accelerate the end-to-end quality control cycle in vehicle manufacturing. We develop this platform in an international feature team, based on state-of-the-art technologies and in close cooperation with our users in the plants. We are looking for an SRE to join our BMW teams of rock-solid specialists developing and operating our AI-based quality assurance solution for BMW's plants. In this position, you will take an important role in maintaining and operating a highly complex platform, working in an international team to secure the quality of our BMW products. Our services mainly run on the Microsoft Azure Cloud Platform. If you are a passionate SRE, preferably with a developer background, who is willing to take responsibility for our platform, share knowledge, and give guidance within the team, and you are thrilled about the latest technology, full of energy and ambition, hands-on, and not afraid of getting your hands dirty, this is the right position for you. |
What should you bring along
| - 3 to 6 years of experience in IT operations or a similar role
- Willing and able to travel internationally (twice a year)
Monitor and Operate IT Products: - Perform regular and sporadic operational tasks to ensure optimal performance of IT services
- Own and maintain the Regular OPS Tasks list, refining sporadic tasks based on input from the Operations Experts (OE) network
Manage IT Service Continuity: - Prepare for and attend emergency exercises (EE), reviewing outcomes and deriving follow-up tasks
- Communicate findings and improvements to the OE network
Manage Availability: - Participate in Gamedays and backup/restore test sessions, practicing and executing backup and restore processes
- Own the recovery and backup plan, reviewing success and identifying follow-up tasks
Manage Capacity: - Monitor cluster capacity using prepared dashboards and coordinate with the DevOps team on any issues
- Plan and execute capacity extensions as needed
Manage Service Configuration: - Oversee service configuration management using ITSM tools
Manage Events: - Observe dashboards and alerts, take action for root cause analysis (RCA), and create tasks for the DevOps team
- Provide proactive feedback and maintain monitoring and alerting solutions
Manage Problems: - Conduct root cause analysis and manage known issues, creating Jira defects for further assistance if required
Enable Changes: - Create and sync changes with the team, assisting with releases and deployment plans
Manage Service Requests and Incidents: - Observe and resolve service requests and incidents, creating Jira tasks for the DevOps team as necessary
Manage Knowledge: - Create, use, and extend knowledge articles, ensuring availability and consistency
You will take part in 24/7 on-call rotations in a future setup with teams around the world and be able to restore systems efficiently. |
Must have technical skills | - Strong understanding of IT service management principles and practices
- Proficiency in monitoring and management tools (e.g. dashboards, alerting systems)
- Strong analytical and problem-solving abilities, particularly in IT service management
- Experience in conducting root cause analysis (RCA) and managing known issues
- Experience in performing regular and sporadic operational tasks to ensure optimal performance of IT services
- Ability to manage IT service continuity, availability, and capacity effectively
- Experience with change management processes, including creating and syncing changes with teams
- Ability to plan and execute capacity extensions and backup/restore processes
- Any additional responsibilities assigned in the Agile Working Model (AWM) Charter
|
Good to have technical skills | - Experience with IT service management frameworks (e.g. ITIL, SRE practices)
- Familiarity with cloud platforms (e.g. Azure) and their operational management
- Experience with automation tools (e.g. Ansible, Puppet, Terraform) and scripting languages (e.g. Python, Bash) to streamline operational tasks
- Understanding of DevOps methodologies and practices, including CI/CD (Continuous Integration/Continuous Deployment) processes
- Knowledge of network protocols, configurations, and troubleshooting to support IT infrastructure
- Understanding of IT security best practices and compliance requirements to ensure secure operations
- Skills in data analysis and visualization tools (e.g. Splunk, Grafana) to interpret operational metrics and trends
- Above-board work ethics
|
Required Experience:
Senior IC