Systems Engineer

Irving, TX - USA

Monthly Salary: USD 105,000 - 115,000

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Department: Insurance

Job Summary

Description

In-Office Position

POSITION SUMMARY:

The Senior Systems Engineer is a hands-on senior individual contributor responsible for designing, building, and operating TRISTAR's core infrastructure platform, with a strong emphasis on Linux systems, Kubernetes, and automation. This role will own the Kubernetes platform end-to-end (cluster build, lifecycle management, operational standards, reliability, and day-2 operations) while partnering closely with development teams as TRISTAR transitions toward a DevOps operating model. Success in this role requires deep technical ownership, strong troubleshooting skills across distributed systems, and the ability to improve reliability through thoughtful design, observability, and repeatable automation.

ESSENTIAL DUTIES AND RESPONSIBILITIES:

Kubernetes Platform Engineering & Lifecycle:

Design, build, and operate Kubernetes clusters in production, including upgrades, patching, scaling, and reliability improvements.

Establish platform standards and operating practices as the environment matures (cluster configuration, access patterns, resource governance, and runbooks).

Serve as the senior escalation point for Kubernetes platform issues and drive resolution through root-cause analysis and prevention.

Kubernetes Storage, Backup/Restore & Disaster Recovery:

Design and implement Kubernetes storage patterns (StorageClasses, PV/PVC lifecycle, capacity planning) and support stateful workloads.

Implement, test, and maintain Kubernetes-native backup/restore and recovery procedures.

Integrate Kubernetes persistence needs with enterprise storage platforms, including Dell ObjectScale and existing virtualization/storage systems.

Ingress, Load Balancing & Kubernetes Networking:

Own Kubernetes traffic entry, including ingress controllers, load balancers, routing patterns, and TLS/certificate handling.

Define repeatable patterns for exposing services and troubleshooting connectivity across platform components.

Linux Systems Engineering:

Administer and harden Linux systems that support the platform, including patching, performance tuning, service reliability, logging, and baseline configuration.

Troubleshoot system and platform issues across compute, storage, and network dependencies.

Automation, Scripting & API Integrations:

Build automation to reduce manual work and increase consistency across infrastructure operations using Python/PowerShell/Bash and API-driven workflows.

Evaluate, recommend, and help implement an automation/configuration management approach (tooling, patterns, and standards) to support repeatable tasks such as provisioning, configuration enforcement, patching, drift detection, and validation.

Develop reusable automation assets (modules/playbooks/templates/scripts) and establish version-controlled workflows (Git), documentation, and operational handoff practices.

Leverage RESTful APIs to integrate systems and create operational workflows (health checks, reporting, event-driven automations, and change validation).

Monitoring, Alert Response & Operational Reporting:

Monitor alert sources and observability tooling (including SolarWinds on-prem), investigate events, and drive issues to completion.

Document incidents, actions taken, and final resolutions; contribute to improved alerting quality and operational visibility.

Data Center Support (Occasional):

Provide occasional on-site support as needed in the data center for infrastructure prep and troubleshooting (racking equipment, cabling, and physical connectivity verification).

Maintain working familiarity with server hardware and data center best practices to support rare hands-on needs.

Cloud Readiness & Future-State Hosting:

Partner with development and infrastructure teams to plan and progress TRISTAR's long-term transition toward cloud-hosted deployments of the application stack.

Contribute to cloud design discussions with a practical understanding of core cloud concepts (networking, identity/access, security, reliability, scalability, and cost considerations) across major providers (AWS/Azure/GCP).

Translate application and platform requirements into cloud-ready operational patterns (container orchestration in cloud, managed services vs. self-managed tradeoffs, environment isolation per client, and deployment repeatability).

Support early-stage cloud initiatives such as proofs of concept, reference architectures, and migration planning, including identifying skill/tooling gaps and recommending realistic next steps.

Apply Infrastructure-as-Code and automation principles to cloud readiness efforts to ensure future deployments are repeatable, supportable, and auditable.

Documentation & Technical Standards:

Create and maintain IT documentation, including platform runbooks, operational procedures, and architecture/standards documentation.

Collaboration, Service Desk Support & Cross-Team Execution:

Work with the Manager Network Services and general IT staff to analyze and resolve technical issues affecting infrastructure and applications.

Partner closely with development teams as part of TRISTAR's DevOps transition to improve operability, deployment reliability, and platform usability.

Work alongside the service desk to remedy end-user workstation issues; backfill and answer service desk calls when required.

Schedule Flexibility & Travel:

Perform night/day/weekend work as required to meet project objectives and support maintenance windows.

Travel to remote sites is rare but may be required as needed.

Qualifications

QUALIFICATIONS REQUIRED:

Education/Experience: Bachelor's degree in a related field (preferred); minimum of 7 years of related experience; or equivalent combination of education and experience.

Knowledge, Skills, and Abilities:

7 years of progressively responsible experience in systems/infrastructure engineering, with strong production experience in Linux administration.

Hands-on production experience with Kubernetes, including cluster build and lifecycle management (architecture, upgrades, patching, scaling, troubleshooting).

Strong understanding of Kubernetes storage and stateful workload operations, including troubleshooting PV/PVC and storage provisioning patterns.

Experience implementing Kubernetes-native backup/restore practices and validating recovery procedures.

Demonstrated automation experience using scripting (Python/PowerShell/Bash) and leveraging RESTful APIs for systems integration and automation.

Experience with monitoring/observability platforms and operational alerting; SolarWinds experience strongly preferred.

Strong troubleshooting skills across distributed systems, networking fundamentals, and infrastructure dependencies.

Strong written and verbal communication skills, including documentation/runbooks/standards.

EQUIPMENT OPERATED/USED: Computer, 10-key, printer, copier, fax machine, and other office equipment.

SPECIAL EQUIPMENT OR CLOTHING: Appropriate office attire.

Required Experience:

Senior IC
