Site Reliability Engineer - Remote

PayNearMe

Posted on : 03-06-2025

Employer Active

1 Vacancy

Job Location

Santa Clara - USA

Monthly Salary

Not Disclosed

Job Description

As our Site Reliability Engineer, you will design, build, and maintain the systems and infrastructure that power our applications, ensuring their reliability, scalability, and performance. You will bring a software engineering approach to operations, automating processes and continuously improving the infrastructure and tools to support our business needs.

What you'll do:

Infrastructure Management: Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code, ensuring high availability and performance.
Kubernetes and Containers: Deploy, manage, and optimize Kubernetes clusters and containerized applications using Docker. Implement best practices for container orchestration and management.
Systems and Application Monitoring/Observability: Develop and maintain comprehensive monitoring and observability solutions using Datadog. Ensure detailed visibility into system performance and application health.
SLOs and SLA Management: Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure reliable and consistent service delivery.
Incident Response and Troubleshooting: Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence. Participate in post-incident reviews and contribute to blameless postmortems.
Reliability and Production Environment Management: Ensure the reliability and stability of our production environments. Continuously assess and improve system reliability, identifying and addressing potential points of failure.
Automation and Scripting: Develop automation scripts and tools to reduce manual intervention and improve system reliability using Python, Bash, or Go. Implement and improve CI/CD pipelines.
CI/CD Pipeline Management: Enhance and maintain continuous integration and continuous deployment pipelines using GitLab CI. Ensure seamless and reliable deployment processes.
Capacity Planning and Scaling: Assist in capacity planning and ensure that systems are scalable to meet future demands. Implement auto-scaling strategies where applicable.
Security and Compliance: Implement security best practices and ensure compliance with industry standards. Regularly review and update security policies and procedures.
Collaboration and Support: Work closely with development teams to ensure reliability and scalability of new features and services. Provide technical support and guidance on infrastructure-related issues.
Software Engineering for Operations: Develop and maintain internal tools and services that enhance the efficiency and reliability of our operations.
On-Call Rotation: Participate in an on-call rotation to address production issues and collaborate in incident response efforts.

Qualifications:

Experience: 3 years of experience in SRE, DevOps, or a related role.
Cloud Platform Experience: Proficient with cloud platforms such as AWS, GCP, or Azure. Experience with EC2, RDS, VPCs, and security groups is essential.
Kubernetes and Containers: Strong experience with Kubernetes and Docker, including deployment, scaling, and management of containerized applications.
Infrastructure as Code: Expert in using Terraform for infrastructure as code. Proficient with configuration management tools such as Ansible, Puppet, or Chef.
Monitoring and Observability: Extensive experience with monitoring and observability tools like Datadog, Prometheus, Grafana, ELK stack, or Splunk. Skilled in setting up detailed monitoring and logging systems.
SLOs and SLA Management: Proven ability to define, monitor, and maintain SLOs and SLAs to ensure reliable service delivery.
Scripting and Automation: Strong skills in scripting languages like Python, Bash, or Go. Experience automating repetitive tasks and processes.
CI/CD Practices: Familiarity with GitLab CI or a similar tool for continuous integration and deployment. Experience in setting up and managing pipelines.
Production Environments: Experience supporting production environments running Go or Ruby/Rails applications.
Tool Development: Ability to write and update tools to support infrastructure and application management, demonstrating the principle that SRE is what happens when you ask a software engineer to design an operations team.
DevOps Best Practices: Deep understanding of DevOps principles, practices, and tools to drive continuous improvement in the software development lifecycle.
Soft Skills: Strong organizational skills, attention to detail, and the ability to work collaboratively in a team environment. Excellent documentation skills to ensure accurate and detailed records.
Problem-Solving Ability: Excellent analytical and problem-solving skills to diagnose and resolve complex system issues quickly and effectively.

Additional Information:

Benefits

Base salary per year (paid semi-monthly)
Fast-paced and professional work culture
Stock options with standard startup vesting - 1 year cliff; 4 years total
$50 monthly communication expense stipend to go towards your phone/internet bill
$250 stipend to enhance your WFH setup
Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
Premium medical benefits including vision and dental (100% coverage for employees)
Company-sponsored life and disability insurance
Paid parental bonding leave
Paid sick leave, jury duty, bereavement
401k plan
Flexible Time Off (our team members typically take off 3-4 weeks per year)
Volunteer Time Off
13 scheduled holidays
In-person team meet-ups 4-6 times per year

Salary Range: $175,000 - $195,000

PayNearMe strives to create a workplace where all employees thrive. Our core values represent who we are today, and we take pride in the way we work with each other as well as with our stakeholders.

We're in this together to do the right thing. We deliver real results we are proud of while remaining respectful, transparent, and flexible.

PayNearMe is an equal opportunity employer. We are diligently and thoughtfully working towards cultivating a diverse workforce, which in turn enhances our products and services for the communities we serve. Applicants who represent all backgrounds are strongly encouraged to apply.

Candidate information will be treated in accordance with our job applicant privacy notice found at: for Disabled Applicants

Alternative formats of this Notice are available to individuals with a disability. Please let us know if you need assistance.

All your information will be kept confidential according to EEO guidelines.

Remote Work:

Yes

Employment Type:

Full-time

Employment Type

Remote
