Overview:
TekWissen is a global workforce management provider headquartered in Ann Arbor, Michigan, that offers strategic talent solutions to clients worldwide. Our client is a provider of digital technology and transformation, information technology, and services.
Position: Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer
Location: Atlanta, GA / Frisco, TX
Duration: 9 Months
Job Type: Temporary Assignment
Work Type: Hybrid
Job Description:
- We are looking for a highly skilled Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer to lead the design, implementation, and optimization of our monitoring and observability ecosystem.
- The ideal candidate will be an expert in Splunk with a strong background in enterprise IT infrastructure, system performance monitoring, and log analytics.
- You will play a pivotal role in ensuring end-to-end visibility across our systems, applications, and services.
Key Responsibilities:
Splunk Administration & Engineering:
- Serve as the SME for Splunk architecture, deployment, and configuration across the enterprise.
- Maintain and optimize Splunk infrastructure, including indexers, forwarders, search heads, and clusters.
- Develop and manage custom dashboards, alerts, saved searches, and visualizations.
- Implement and tune log ingestion pipelines using Splunk Universal Forwarders, HTTP Event Collector, and other data inputs.
- Ensure high availability, scalability, and performance of the Splunk environment.
- Create dashboards, reports, alerts, advanced Splunk searches, visualizations, log parsing, and external table lookups.
- Expertise with SPL (Search Processing Language) and understanding of Splunk architecture, including configuration files.
- Broad experience in monitoring and troubleshooting applications using tools like AppDynamics, Splunk, Grafana, Argos, and OTEL to build observability for large-scale microservice deployments.
- Create dashboards for various applications to monitor health and network issues, and configure alerts.
- Excellent problem-solving, triaging, and debugging skills in large-scale distributed systems.
- Establish and document runbooks and guidelines for using the multi-cloud infrastructure and microservices platform.
- Experience optimizing search queries using summary indexing.
- Solid knowledge and experience in monitoring the Splunk infrastructure.
- Develop a long-term strategy and roadmap for AI/ML tooling to support the AI capabilities across the Splunk portfolio.
- Diagnose and resolve network-related issues affecting CI/CD pipelines; debug DNS, firewall, proxy, and SSL/TLS problems; and use tools like tcpdump, curl, and netstat for proactive maintenance.
Enterprise Monitoring & Observability:
- Design and implement holistic enterprise monitoring solutions, integrating Splunk with tools like AppDynamics, Dynatrace, Prometheus, Grafana, SolarWinds, or others.
- Collaborate with application, infrastructure, and security teams to define monitoring KPIs, SLAs, and alert thresholds.
- Build end-to-end visibility into application performance, system health, and user experience.
- Integrate Splunk with ITSM platforms (e.g., ServiceNow) for event and incident management automation.
Operations, Troubleshooting & Optimization:
- Perform data onboarding, parsing, and field extraction for structured and unstructured data sources.
- Support incident response and root cause analysis using Splunk for troubleshooting and forensics.
- Regularly audit and optimize search performance, data retention policies, and index lifecycle management.
- Create runbooks, documentation, and SOPs for Splunk and monitoring tool usage.
Required Qualifications:
- 5 years of experience in IT infrastructure, DevOps, or monitoring roles.
- 3 years of hands-on experience with Splunk Enterprise as an admin, architect, or engineer.
- Experience designing and managing large-scale, multi-site Splunk deployments.
- Strong skills in SPL (Search Processing Language), dashboard design, and alerting strategies.
- Familiarity with Linux systems, scripting (e.g., Bash, Python), and APIs.
- Experience with enterprise monitoring tools and their integration with Splunk (e.g., AppDynamics, Dynatrace, Nagios, Zabbix).
- Understanding of logging, metrics, and tracing in modern environments (on-prem and cloud).
- Strong understanding of network protocols, system logs, and application telemetry.
Preferred Qualifications:
- Splunk certifications (e.g., Splunk Certified Power User, Admin, Architect).
- Experience with Splunk ITSI, Enterprise Security, or Observability Suite.
- Knowledge of cloud-native environments (AWS, Azure, or GCP) and cloud monitoring integrations.
- Experience with log aggregation, security event monitoring, or compliance (e.g., PCI, HIPAA, SOX).
- Familiarity with CI/CD pipelines and GitOps practices.
TekWissen Group is an equal opportunity employer supporting workforce diversity.