Site Reliability Engineer - Product

Job Location:

Irvine, CA - USA

Monthly Salary: Not Disclosed
Posted on: 13 hours ago
Vacancies: 1

Job Summary

MatchPoint Solutions is a fast-growing, young, energetic global IT-Engineering services company with clients across the US. We provide technology solutions to various clients like Uber, Robinhood, Netflix, Airbnb, Google, Sephora, and more! More recently, we have expanded to working internationally in Canada, China, Ireland, the UK, Brazil, and India. Through our culture of innovation, we inspire, build, and deliver business results from idea to outcome. We keep our clients on the cutting edge of the latest technologies and provide solutions by using industry-specific best practices and expertise.

We are excited to be continuously expanding our team. If you are interested in this position, please send over your updated resume. We look forward to hearing from you!

Title: Site Reliability Engineer - Product

Duration: 3 to 6 months, contract-to-hire
Location: On-site 5 days a week in Irvine, CA
Rate: $60 to $65/hr (W-2)
Role Overview
As a Site Reliability Engineer on the Product Platform team, you will design, build, and operate shared cloud-native platform services that enable product teams to deliver reliable, scalable, and secure applications.
You will partner closely with product and engineering teams to improve platform reliability, automate operations, and provide resilient infrastructure, tooling, and standards that support all product teams.
Responsibilities
Product Platform & Cloud Engineering
  • Design and operate shared multi-cloud platform infrastructure (AWS, GCP, Azure) supporting enterprise healthcare products.
  • Build and maintain Kubernetes-based platform services, including multi-cluster environments and service mesh (Istio), to enable scalable and resilient application delivery.
  • Implement Infrastructure-as-Code (Terraform, Helm, CloudFormation) to provide consistent, repeatable platform environments.
  • Design high-availability, cross-region disaster recovery and deployment architectures for mission-critical product services.
  • Align platform capabilities with product needs, optimizing for reliability, scalability, and cost efficiency.
Platform Automation & Enablement
  • Develop and maintain CI/CD and GitOps workflows (Bitbucket Pipelines, ArgoCD) as shared platform capabilities.
  • Automate infrastructure, application configuration, and database deployments (Ansible, Liquibase).
  • Build automated health checks, self-healing, and zero-downtime deployment mechanisms for platform services.
  • Provide technical guidance and best practices to product engineering teams using the platform.
Reliability, Security & Observability
  • Design and operate platform-wide monitoring and observability solutions using Prometheus, Grafana, and OpenTelemetry.
  • Build dashboards and standardized alerting for all platform and product services.
  • Enforce platform security standards, including container scanning, secrets management, RBAC, and secure network policies.
  • Ensure platform compliance with HIPAA, SOC 2, and ISO 27001 through automated controls and secure communication.
  • Reduce operational toil through automation, cost visibility, and continuous reliability improvements.
Nice to Have
  • Experience with event streaming platforms and data services (Kafka/MSK, Flink, Debezium).
  • Exposure to MLOps or AI platform infrastructure (MLflow, Kubeflow, GenAI/RAG workloads).
  • Familiarity with FinOps and cost optimization for shared platforms.
Qualifications
Required
  • Bachelor's degree in Computer Science or a related field and 4 years of relevant experience (or equivalent).
  • Experience operating production-grade cloud infrastructure and Kubernetes platforms.
  • Strong Infrastructure-as-Code, CI/CD, GitOps, and observability experience.
  • Proven troubleshooting skills across distributed systems and platform reliability issues.
  • Proficiency in at least one scripting or programming language (Python, Go, Bash).
  • Experience with on-call rotations, incident response, and root cause analysis.
Preferred
  • Experience building internal platforms or shared developer services.
  • Multi-cloud experience and service mesh familiarity (Istio).
  • Experience in regulated environments and cloud-native security practices.
MatchPoint Solutions provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.

This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.


Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting