Job Description
We partner with the most important institutions in the world to transform how they use data and technology. Our software has been used to stop terrorist attacks, discover new medicines, gain an edge in global financial markets, and more. If these types of projects excite you, we'd love for you to join us.
The Role:
You'll be a key part of a cross-functional team that helps critical institutions solve their most pressing problems. Together with a team of analysts, developers, technical project managers, and systems experts, you will be directly responsible for keeping a Palantir system running smoothly and securely. You'll serve as the team's expert and owner for all things systems administration. You are the first line of defense against a variety of threats to the uptime of your servers.
On calm days, you'll proactively keep things safe and secure by implementing industry best practices, security updates, and Palantir-developed systems automation. When the unexpected occurs, you'll follow the trail of monitoring alerts and log messages to the source of the trouble, triaging outages and working with your team to understand what went wrong and how to fix it for good.
Core Responsibilities
- Administer enterprise Linux servers, including operating system patching, security hardening, monitoring, and troubleshooting.
- Administer AWS cloud accounts with Terraform, and troubleshoot and debug via the AWS Console and CLI.
- Handle the operations of data storage and indexing systems, including monitoring, backup management, and upgrades.
- Configure and maintain web servers, including monitoring and configuration management.
- Work with customer IT teams to coordinate changes and troubleshoot intersystem problems.
Technologies We Use
- Amazon Web Services and on-premises servers
- CentOS and Red Hat Enterprise Linux
- Prometheus and Grafana
- Oracle, Postgres, Cassandra, and Elasticsearch
- NGINX and Envoy HTTP servers
- Puppet, Ansible, Python, and shell scripting
What We Value
- Experience with Linux system administration
- Ability to troubleshoot server hardware failures
- Understanding of Amazon Web Services
- Applied knowledge of operating system security and hardening
- Ability to automate repetitive tasks using Ansible, Python, or a similar language
- Experience with patch and configuration management in enterprise production environments
- Exposure to the operation and configuration of database and web server technologies
- Unwavering commitment to operational security and best practices
Certificates/Security Clearances/Other
Top Secret/SCI
Required Experience:
Unclear Seniority