Description
We're passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance to pursue their missions. We are currently seeking an engineer experienced in public cloud to plan, design, and implement next-generation cloud infrastructure solutions. The Cloud Engineer will be part of the Engineering team and will need strong knowledge of application monitoring, infrastructure monitoring, automation, maintenance, and service reliability improvements.
Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
Reasons why you would love joining Ford:
- Belong to a leading global company in the automotive industry, with more than 120 years in the market
- Full-time fixed contract with a competitive starting compensation and a benefits package (restaurant card, discounts, etc.)
- Work-life balance: 33 vacation days and work under a hybrid model (2/3 days a week)
- Career development path: being part of high-impact projects that allow you to improve your technical skills and develop professionally.
Responsibilities
- Design, automate, and manage a highly available and scalable cloud deployment that allows development teams to deploy and run their services.
- Collaborating with engineering and architecture teams to evaluate and identify optimal cloud solutions, with attention to scalability, high performance, and security.
- Modernising existing on-prem solutions and improving existing systems.
- Extensively automating deployments and managing applications in GCP.
- Developing and maintaining cloud solutions in accordance with best practices.
- Ensuring efficient functioning of data storage and processing functions in accordance with company security policies and best practices in cloud security.
- Collaborating with Engineering teams to identify optimization strategies and help develop self-healing capabilities.
- Developing strong observability capabilities.
- Identifying, analysing, and resolving infrastructure vulnerabilities and application deployment issues.
- Regularly reviewing existing systems and making recommendations for improvements.
Qualifications
- The candidate should possess a strong understanding of effective data visualization techniques: choosing the right chart types for different data and creating clear and informative dashboards.
- Familiarity with the DORA metrics (Deployment Frequency, Lead Time for Changes, Change Failure Rate, Time to Restore Service) and their importance in assessing DevOps performance is essential.
- Understanding how CI/CD events integrate with observability is vital. The candidate should know how to correlate events from CI/CD pipelines with performance and error data.
- The engineer should have experience working with RESTful APIs, both consuming and potentially building them. This is crucial for integrating with various services and collecting data.
- Understanding API specifications (Swagger/OpenAPI) is beneficial for interacting with various services.
- Building and maintaining dashboards in Grafana, connecting to various data sources (including GCP services).
- Implementing and managing SLOs using Nobl9
- Optimizing BigQuery queries for performance
- Automating data collection and processing using Cloud Scheduler and other GCP services
- Proficiency in using Dynatrace for application performance monitoring (APM) is critical. This includes setting up monitoring, analyzing performance bottlenecks, and creating dashboards. They should understand Dynatrace's data model and its capabilities.
- Experience with Nobl9 for reliability management is a valuable asset. This shows an understanding of SRE principles and the ability to define and monitor SLOs (Service Level Objectives) and error budgets.
- Strong Grafana skills are essential for dashboard creation and visualization. They should be able to build interactive dashboards, create custom visualizations, and connect to various data sources.
- Understanding Prometheus's role in metrics collection and its architecture is important. This often involves configuring exporters and understanding how to query and analyze metrics.
- Deep understanding of the OpenTelemetry Collector's configuration, pipelines, and processing capabilities is crucial. This includes choosing appropriate exporters and processors based on the environment and requirements.
- Proven work experience in Docker
- Proven working experience with the Apigee API gateway is an advantage
- Experience in package, config, and deployment management via Helm, Kustomize, and ArgoCD.
- Strong knowledge of GitHub DevOps (Cloud Build and Deploy is an advantage)
- Should be proficient in scripting and coding, including traditional languages like Python, Golang, Java, JS, and Node.js
- Exposure to Cloud Monitoring and logging
- Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN)
- Experience with automation tools should be a priority
- Languages: English advanced/high level needed
Additional Information
Ford is committed to diversity and equality of opportunity for all and is opposed to any form of less favourable treatment or harassment on the grounds of gender, marital status, civil partnership status, parental status, race, ethnic origin, colour, nationality, national origin, disability, sexual orientation, religion/belief, gender reassignment and gender identity, age, and those with caring responsibilities.