Application Operational Support / Site Reliability Engineer

VDart Inc

Job Location:

Irving, TX - USA

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1 Vacancy

Job Summary

Role: Application Operational Support / Site Reliability Engineer

Location: Irving, TX or Charlotte, NC (Hybrid)

Type: Contract

Role Summary

  • We are seeking a highly skilled Application Operational Support / Site Reliability Engineer to support and operate mission-critical enterprise applications in a highly regulated environment. This role is responsible for ensuring platform reliability, availability, and operational excellence through strong CI/CD practices, observability, incident management, and customer-facing remediation.
  • The ideal candidate combines strong technical troubleshooting skills with disciplined operational practices and the ability to work independently with stakeholders.

Key Responsibilities

  • Support production and pre-production environments to ensure high availability, performance, and stability of enterprise applications.
  • Support and maintain CI/CD pipelines using tools such as GitHub Actions, Harness, or similar.
  • Partner with engineering teams to improve deployment reliability, reduce manual steps, and enable repeatable releases.
  • Assist with deployment automation and release coordination across environments.
  • Execute Incident, Change, and Problem Management processes using ServiceNow.
  • Lead or contribute to major incident calls, ensuring clear communication, coordination, and resolution.
  • Perform root cause analysis and drive permanent fixes through problem management practices.
  • Monitor application and platform health using tools such as Splunk, Grafana, AppDynamics, or equivalent.
  • Configure dashboards, alerts, and monitoring thresholds to proactively identify issues.
  • Use telemetry data to identify performance bottlenecks and reliability risks.
  • Partner with application, infrastructure, and security teams to resolve complex cross-functional issues.
  • Identify operational gaps and recommend improvements to tooling, processes, and automation.
  • Contribute to runbooks, operational documentation, and standard operating procedures.
  • Support platform modernization initiatives aligned with reliability and scalability goals.

Required Skills & Experience

Core Skills

  • 5 years of experience in application/platform operations, production support, or SRE roles.
  • 3 years of experience with CI/CD pipelines (GitHub Actions, Harness, or similar tools).
  • Solid understanding of Incident, Change, and Problem Management processes, preferably using ServiceNow.
  • 2 years of experience with observability and monitoring tools such as Splunk, Grafana, AppDynamics, or equivalent.
  • Excellent troubleshooting and critical thinking skills, with the ability to diagnose complex production issues.
  • Proven experience interacting directly with customers or business stakeholders during operational events.

Technical Competencies

  • Strong understanding of application deployment, runtime environments, and system dependencies.
  • Ability to read logs, metrics, and traces to identify root causes.
  • Familiarity with cloud-native or hybrid enterprise environments.

Nice-to-Have Skills:

  • Experience with VM image creation/build processes.
  • Exposure to OpenShift / OCP or Kubernetes-based platforms.
  • Experience operating in regulated environments (banking, financial services).

Key Skills

  • Splunk
  • IIS
  • SQL
  • .NET
  • Perl
  • Shell Scripting
  • WebLogic
  • Java
  • Sybase
  • Scripting
  • Oracle
  • Application Support