Job Title: Grafana and Prometheus
Location: Melbourne, FL; Dallas/Frisco, TX; Cary, NC.
Job Description:
Grafana and Prometheus
- Expertise in installing, configuring, and scaling Prometheus components.
- Experience evaluating Prometheus monitoring and alerting infrastructure.
- Expertise in installing and configuring the required exporters to collect and expose metrics from different applications, services, and systems.
- Collaborate with development and operations teams to integrate Prometheus exporters into existing applications and systems.
- Conduct thorough testing of Prometheus exporters to ensure the accuracy, reliability, and performance of metrics collection.
- Experience in Prometheus integration with other tools.
- Experience in configuring custom exporters.
- Experience in using Prometheus APIs and libraries.
- Develop and implement custom dashboards for monitoring key metrics.
- Troubleshoot issues, ensure data accuracy, and optimize query performance.
- Design and manage alerting rules for proactive issue identification and resolution.
- Continuously improve and expand monitoring coverage to meet evolving needs.
- Collaborate with teams to define alert thresholds and escalation procedures.
- Analyse metrics data to identify performance bottlenecks and areas for improvement.
- Create meaningful visualizations and reports to provide insights for stakeholders.
- Contribute to the enhancement of data retention and archiving strategies.
- Collaborate with the infrastructure team to ensure seamless integration and scalability of Grafana and Prometheus.
- Collect, generate, or help refine high-level requirements, and create the implementation strategy, acceptance criteria (with input from the customer), and test cases.
- Ability to understand complex architectures and applications.
- Proficiency in creating custom Grafana dashboards and PromQL queries.
- Strong understanding of monitoring best practices, alerting, and data analysis.
- Knowledge of time-series databases and storage strategies.
- Scripting and automation skills for efficient system management.
- Excellent troubleshooting and problem-solving abilities.
- Strong communication and collaboration skills.
- Knowledge of DevOps practices and tools.
- Relevant certifications are a plus.
Customer-facing experience is a must.