Our goal
To be the dominant digital-first wealth and asset management partner for the underserved African middle class and fast-growing African businesses. We empower over 3 million customers to build a savings and investment culture across different asset classes. Our customer base continues to expand and we are committed to ensuring that every interaction with our platform provides the best experience possible.
The role
We're looking for a Site Reliability Engineer (SRE) to help build, maintain, and scale the infrastructure powering our platform. You'll work closely with our engineering team to improve reliability, observability, security, and deployment processes across our systems. Our infrastructure team specializes across four areas: Cloud, Databases, Platform, and Observability. We run primarily on AWS, with some workloads on GCP. For this role, we're particularly interested in someone who can raise the bar on observability, helping us detect issues faster and resolve them with confidence.
What you'll do
Generally, members of the infrastructure team are able to do the following:
Design, maintain, and improve cloud infrastructure and internal platforms
Improve system reliability, scalability, and performance across services
Build and maintain CI/CD pipelines and deployment workflows
Implement monitoring, logging, alerting, and observability systems
Respond to incidents, troubleshoot production issues, and lead root cause analysis
Automate operational tasks and infrastructure provisioning
Work with engineering teams to improve service architecture and operational readiness
Improve security posture, access controls, and infrastructure best practices
Manage containerized workloads and orchestration platforms
Maintain disaster recovery, backup, and high availability strategies
What we're looking for
Required
4 years of experience in an SRE, DevOps, or Platform Engineering role running production systems
Strong hands-on experience with AWS (compute, networking, IAM, storage, managed services)
Deep expertise in observability: designing meaningful metrics, dashboards, alerts, and SLOs that actually catch problems before users do
Hands-on experience with New Relic, Grafana, and Prometheus (or equivalent tooling)
A track record of reducing MTTD and MTTR through better instrumentation, alerting, and incident response practices
Proficiency with Docker and containerized workflows
Solid scripting and automation skills (Python, Bash, Go, or similar)
Experience with infrastructure-as-code (Terraform, Pulumi, or CloudFormation)
Strong Linux fundamentals and networking knowledge
Experience building and maintaining CI/CD pipelines
Comfort leading incident response and writing clear post-mortems
Nice to have
Experience operating Kubernetes in production
Exposure to GCP or multi-cloud environments
Background in one of our specialization areas: Databases (Postgres, MySQL, Redis), Platform engineering, or Cloud architecture
Experience in fintech or other regulated, high-availability environments
The people who succeed on this team
People who are proactive and take ownership
Engineers who automate before repeating manual work
People who stay calm and methodical during incidents
Engineers who care about clean systems and operational excellence
Strong collaborators who work well across teams
Curious builders who enjoy learning and improving systems continuously
Required Experience:
IC