Position Overview
As a Systems & Infrastructure Engineer, you will support and maintain our Linux-based data analytics platform. You will be responsible for system lifecycle management, platform reliability, containerized workloads, and operational compliance in a regulated environment. The ideal candidate has hands-on experience with Ubuntu Linux, understands modern containerization and orchestration technologies (Docker/Kubernetes), and thrives in a distributed, technically complex, data-centric environment.
Essential Duties & Responsibilities:
Platform Operations & System Administration
- Install, configure, upgrade, and decommission:
  - BIOS and firmware
  - Ubuntu/Linux operating systems
  - System-level packages, software applications, modules, and dependencies
- Manage and maintain virtualization or container environments, including Docker and Kubernetes workloads
- Monitor system resource utilization, scalability, and performance of compute nodes and platform services
- Perform routine system health checks, vulnerability assessments, and patch management
- Troubleshoot and resolve Linux OS issues, compute environment problems, network connectivity concerns, storage issues, and node-level failures
Platform Management & User Operations
- Handle daily operational requests, including:
  - User management, access provisioning, and permissions updates
  - Data access requests and entitlement adjustments
  - Break-fix support and incident response
  - Ticket queue management, documenting work in accordance with SLAs
- Collaborate with engineering, analytics, and DevOps teams to support environment stability and improvements
- Ensure high availability of critical platform services used by computation, data analysis, and ETL workflows
Security, Compliance & Audit Support
- Maintain environment compliance with SOC 2, HIPAA, and PCI requirements through year-round operational discipline
- Implement and validate security controls such as:
  - Patch management
  - Access controls and logging
  - Vulnerability remediation
  - Configuration management and change tracking
- Document platform changes, architecture, and controls to support compliance
- Provide audit support annually through evidence collection, system reports, configuration exports, and control demonstrations
Automation & Reliability Engineering
- Develop automation scripts using Bash, Python, or similar languages to streamline operational processes
- Enhance system reliability through:
  - Infrastructure-as-Code templates (e.g. Terraform, Ansible)
  - Automated deployments and environment builds
  - Monitoring and alerting improvements
- Participate in capacity planning, performance tuning, and architectural enhancements for high-volume compute and analytics workloads
Systems Engineering in a Computational Analytics Environment
- Manage compute clusters supporting data science, analytics, and batch workloads
- Oversee job scheduling environments (Kubernetes jobs, Cron, workflow schedulers)
- Support distributed file systems, object storage, or high-throughput data pipelines as needed
- Maintain security and operational continuity across multi-node environments
Required Skills:
Required
- 3-6 years of hands-on experience with Ubuntu/Linux system administration
- Working knowledge of Docker and Kubernetes in a production environment
- Experience with system patching, kernel upgrades, firmware/BIOS updates, and environment hardening
- Familiarity with security best practices, access control, and compliance-driven operations
- Strong troubleshooting skills across systems, networking, and application layers
- Scripting experience (Bash, Python, or similar)
- Experience working in remote, distributed teams
Preferred
- Experience supporting a high-performance computing (HPC), large-scale analytics, or distributed compute environment
- Exposure to CI/CD pipelines, GitOps, or automated infrastructure provisioning
- Understanding of SOC 2/HIPAA/PCI controls, audits, or regulated computing environments
- Experience with monitoring tools (Prometheus, Grafana, Zabbix, etc.)
Soft Skills
- Strong communication skills and ability to document clearly
- Attention to detail, especially regarding compliance requirements
- Ability to work independently, manage priorities, and meet operational SLAs
- Proactive mindset with a drive to automate and improve the platform
What's in it for You
- Opportunity to work in the booming field of cloud data management and analytics alongside some of the brightest minds in the industry
- Opportunity to work with cutting-edge technology
- Chance to work with a rapidly expanding US tech company
- Flexible schedule and paid time off
- Competitive salary and benefits package