A leading B2B SaaS platform in the cross-border e-commerce sector is expanding its North America operations. We're seeking a Senior DevOps Engineer / Site Reliability Engineer (SRE) to architect and maintain our unified global O&M (operations and maintenance) platform.
This is a newly created role supporting our North America team's contribution. You'll work directly with our Middle Platform Director, Technical Experts, and CEO in a collaborative, remote-first environment.
KEY RESPONSIBILITIES:
Design, develop, and maintain unified operation and platform management systems covering resource management, monitoring & alerting, configuration management, and automated operation & maintenance
Build and operate observability platforms and CI/CD pipelines; develop self-healing systems and automated incident response processes to realize intelligent O&M
Establish DevOps standards and best practices; promote standardization of DevOps toolchains (technology selection, version management)
Provide platform-level technical support for product and engineering teams; resolve complex system issues, reduce technical debt, and lead infrastructure and architecture upgrades
Promote SRE concepts and engineering practices; organize technical sharing and training; build a reliability engineering system
Conduct technical research and innovation; track cloud-native/DevOps industry trends; evaluate new technologies and drive continuous modernization of O&M platforms
REQUIRED QUALIFICATIONS:
Currently residing in California or North Carolina, USA
US Green Card or US Citizenship (work authorization; no sponsorship available)
Fluent in Mandarin Chinese (working language; close collaboration with domestic R&D required)
Bachelor's degree or above in Computer Science or related field
4-6 years of hands-on experience in DevOps/SRE/Platform Engineering
Proficient in at least one major cloud platform (AWS/Azure/GCP), with deep understanding of VPC, EC2, EKS/K8s, RDS, and IAM
Proficient in Linux, networking, containers (Docker/Kubernetes), load balancing, and service governance
Skilled in IaC (Infrastructure as Code) tools: Terraform, Ansible, Helm
Experience building CI/CD pipelines: Jenkins, Argo CD, CodeBuild, etc.
Familiar with monitoring/logging/tracing: Prometheus, Grafana, ELK, OpenTelemetry
Proficient in at least one development/scripting language: Python, Shell, Go
Excellent system design, analysis, and troubleshooting skills
Strong cross-team communication and collaboration abilities
PREFERRED QUALIFICATIONS:
Master's degree in Computer Science or related field
Experience with global platforms, cross-border SRE, and multi-cloud O&M
Led platform reconstruction, self-healing systems, or observability initiatives
Experience with Go development, service mesh, chaos engineering, and capacity planning
Demonstrated success improving system availability, reducing incident rates, and increasing automation
Global technical vision and cross-cultural collaboration experience
Results-oriented and self-driven; experienced in technical evangelism/sharing
COMPENSATION:
Base Salary: $140,000 - $160,000 annually (top candidates may receive 5-10% upward adjustment)
401(k): Dollar-for-dollar match up to 4% of salary
Medical Insurance
PTO: 12 days annually
Social Security & Housing Fund: Contributed per US legal requirements
WORK ENVIRONMENT:
Location: Silicon Valley, CA or Raleigh, NC (homebase available)
Department: Tech O&M Department
Working Style: Remote-first
Hours: 8 hours per day, weekends off
Travel: No business travel required
Expected Start: ASAP
Interview Process:
Round 1 (Online): Middle Platform Director, Technical Expert
Round 2 (Online): Head of HR
Round 3 (Online): CEO/Founder