Salary Not Disclosed
1 Vacancy
We're OneText, a YC-backed (Winter 23) startup in the Bay Area, and we're looking for a DevOps/DBA Engineer.
We're growing faster than we can manage! Since raising our seed round, we've:
Come up with more ideas for features than we could build in a lifetime, and shipped a ton of them anyway
Solved our fair share of scaling issues, allowing us to process tens of millions of webhooks and outbound messages per hour
Built up some huge revenue streams from our dedicated customers, who want even more of what we have to offer, helping us gear up toward our targets for raising our Series A
Had a lot of fun integrating AI into every part of our product
So: join us if you like the idea of a startup environment that is fast-paced, but in a sustainable way. There is no shortage of fun engineering challenges and new things to learn. But we always want to be deliberate and smart about what we decide to build, and not just race from one thing to the next.
But for this role specifically, here's what we're looking for:
We use a mixture of:
Postgres: our core database, used for everything that we want to stick around forever and be highly durable/consistent: accounts, payments, billing, configuration, and so on.
Mongo: used for tracking and scheduled tasks. We're also planning to migrate very write-heavy tables like messages, fees, events, etc. to Mongo.
Redis: used for caching and data that only needs a finite TTL, like temporary shortlinks.
AWS SQS & EventBridge: used for scheduling and queuing Lambdas for high-throughput tasks like campaigns.
ClickHouse: used for analytics.
We need you to be experienced in:
Writing really optimal and well-formed queries
Picking the right database type for the right job
Scaling all of these vertically or horizontally
Building in indexes, partitions, etc.
Warehousing older data for larger tables
Monitoring the performance and health of these databases
Right now we primarily use DigitalOcean but we have started to use AWS for some new services. We feel there is a strong case to move over completely to AWS or GCP as we scale.
We want you to have strong opinions on which cloud providers are great and which are not. And we want you to come up with a plan for how OneText lives on and is deployed to the right platform in the future, and for how we migrate to get there.
We have two kinds of traffic:
Steady traffic (usually initiated via webhook events from shopper actions on our customers' stores)
Burst traffic from our customers scheduling campaigns to hundreds of thousands or millions of customers at once (especially during holiday periods like Black Friday)
Right now we have:
A frontend written in React
An API layer written in Node
A worker layer written in Node
Postgres/Mongo/Redis databases
A campaign engine written using AWS EventBridge, SQS, and Lambdas
We want you to help us optimize these for scale. Whether that's figuring out a good strategy to help our workers churn through as many tasks as fast as possible, migrating certain operations from one database type to another, delegating tasks from our API to our workers, or anything else that helps us be fast when we really need to be.
Find a problem with how we manage concurrency in accepting new tasks? Notice our database connections aren't being pooled properly? We want you to be able to jump into our app and make any fixes you need to. So some knowledge of Node/TypeScript would be very helpful, but a willingness to learn and ask questions is the most important thing here.
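To give a flavor of the kind of concurrency fix we mean, here is a minimal sketch of capping how many tasks a Node worker runs at once. It is an illustration only, not our actual worker code: the names `Task` and `runWithLimit` are hypothetical, and a real worker would pull tasks from a queue rather than an in-memory array.

```typescript
// Hypothetical task shape: any function that returns a promise.
type Task<T> = () => Promise<T>;

// Run tasks with at most `limit` in flight at once.
// Results are returned in the same order as the input tasks.
async function runWithLimit<T>(tasks: Task<T>[], limit: number): Promise<T[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }

  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next unclaimed task

  // Each runner claims the next unclaimed task until none remain.
  // Claiming is safe without locks because Node runs this code on one thread.
  async function runner(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }

  // Start up to `limit` runners. If any task rejects, the returned promise
  // rejects too; tasks already in flight are not cancelled.
  const runners = Array.from(
    { length: Math.min(limit, tasks.length) },
    () => runner(),
  );
  await Promise.all(runners);
  return results;
}
```

The same idea applies to database connections: a fixed-size pool bounds how many queries hit the database at once, instead of opening a new connection per task.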
We have two major use cases in mind here:
We want to be able to segment users based on properties or events they have attached to their account. This is mainly used to be able to schedule campaigns to the correct set of users.
We want to be able to get really good reporting for messages, revenue, clicks, fees, ROI, and so on for all of the SMS-based flows and automations we run.
We're thinking an OLAP database would be a good fit for these two problems. We've been relying too much on our production database for analytical tasks like these. We've started building on ClickHouse for this reason.
We would like you to be familiar enough with ClickHouse or other OLAP databases, or willing to learn enough, to start solving for these.
What happens when we get errors, when databases go down or run out of space, or when our cloud provider fails to deploy our code?
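As a rough illustration of the segmentation use case, the matching rule can be sketched in TypeScript as below. This is an in-memory toy under assumed data shapes: `User`, `Segment`, `matchesSegment`, and `selectSegment` are hypothetical names, not our schema, and at our scale the same rule would be expressed as a query against the OLAP store rather than a loop over objects.

```typescript
// Hypothetical shapes for illustration; not the real OneText schema.
type PropertyValue = string | number | boolean;

interface UserEvent {
  name: string;
  at: Date;
}

interface User {
  id: string;
  properties: Record<string, PropertyValue>;
  events: UserEvent[];
}

// A segment requires every listed property to match exactly and,
// optionally, at least one event with the given name on or after `since`.
interface Segment {
  properties?: Record<string, PropertyValue>;
  event?: { name: string; since: Date };
}

function matchesSegment(user: User, segment: Segment): boolean {
  const propsOk = Object.entries(segment.properties ?? {}).every(
    ([key, value]) => user.properties[key] === value,
  );

  const wanted = segment.event;
  const eventOk =
    wanted === undefined ||
    user.events.some(
      (e) => e.name === wanted.name && e.at.getTime() >= wanted.since.getTime(),
    );

  return propsOk && eventOk;
}

// Return the ids of all users that belong to the segment.
function selectSegment(users: User[], segment: Segment): string[] {
  return users.filter((u) => matchesSegment(u, segment)).map((u) => u.id);
}
```

A campaign scheduler could call `selectSegment` with, say, a property filter and a recent "purchase" event to get the recipient list; an empty segment matches every user.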
We want to have good contingencies and backups for all of these cases and enough redundancy that we can keep our app highly available and able to deploy at any time.
We want to go as fast as possible from merging in new code (once it's tested and guaranteed to be stable) to having that code built and live on the production site, with all tests run, database migrations performed, and so on.
We also want to make sure we have good testing environments to give us as much confidence as possible before deploying new code.
Full-Time