Job Title: Observability Lead / Architect
Location: Dallas, TX (2-3 days per week in the office)
Observability Architect, New Relic, Splunk, CloudWatch, Kibana, APM, Monitoring Solutions
Experience Required: 8 Years
Key Responsibilities:
Design and implement end-to-end observability strategies covering metrics, logs, traces, and user experience monitoring
Architect custom monitoring frameworks tailored to specific business applications and infrastructure landscapes
Implement and manage observability platforms, including New Relic, Splunk, AWS CloudWatch, and Kibana
Develop and maintain APM scripts, synthetic monitors, custom dashboards, and alerting mechanisms
Integrate observability tools with CI/CD pipelines for proactive issue detection and reduced MTTR
Collaborate with application, infrastructure, DevOps, and security teams to ensure observability coverage across systems
Conduct root cause analysis by correlating metrics, logs, and traces
Provide technical leadership in observability best practices, architecture reviews, and roadmap planning
Define and enforce standards for SLAs, SLOs, and SLIs across environments
Mentor and guide engineering teams in the effective use of observability tools
Key Skills & Technologies:
Monitoring & APM Tools:
Deep experience with New Relic (including APM, infrastructure, synthetics, and custom instrumentation)
Strong proficiency in Splunk (querying, dashboards, alerts, ingestion pipeline design)
Hands-on experience with AWS CloudWatch (metrics, logs, alarms, insights)
Working knowledge of Kibana and Elastic Stack (ELK)
Scripting & Customization:
Experience in APM scripting and custom instrumentation (using Java, Python, or agents)
Ability to create synthetic monitors, custom event generators, and automated dashboards
Familiarity with Terraform, CloudFormation, or scripting languages (Shell, Python) for observability automation
Architecture & Integration:
Expertise in designing observability frameworks for cloud-native (AWS/GCP/Azure) and hybrid environments
Understanding of distributed systems, microservices, and event-driven architectures
Ability to integrate observability platforms with DevOps pipelines, incident response, and ITSM tools
Job Type: Full-time