About the role
We're looking for an engineer who thrives on building scalable data platforms and enjoys tackling complex backend challenges. This isn't just a data engineering role: you'll be designing and optimizing the data platform that powers our API, managing everything from data streaming and storage to analytics features at petabyte scale.
You should be comfortable navigating both data and backend engineering, with a solid foundation in software development. You'll work with advanced data architectures including Iceberg, Flink, and Kafka, tackling large-scale challenges and contributing to core product development in Java and Python. If you're excited by the opportunity to shape a high-impact platform and tackle diverse engineering problems, we'd love to hear from you.
What you will do
Own projects aimed at enhancing data replication, storage, enrichment, and reporting capabilities.
Build and optimize efficient streaming and batch data pipelines that support our core product and API.
Design scalable storage solutions for handling petabytes of IoT and time-series data.
Develop and maintain real-time data systems to ingest growing data volumes.
Implement distributed tracing, data lineage, and observability patterns to improve monitoring and troubleshooting.
Write clean, maintainable code in Java and Python for various platform components.
Shape architectural decisions to ensure scalability and reliability throughout the data platform.
The ideal candidate will have
3 years of experience in platform engineering or data engineering.
2 years of experience designing and optimizing data pipelines at TB to PB scale.
Proficiency in Java, with a focus on clean, maintainable code.
Strong system design skills with a focus on big data and real-time workflows.
Familiarity with lakehouse architectures (e.g., Iceberg, Delta, Paimon).
Experience with real-time data processing tools such as Kafka, Flink, and Spark.
Knowledge of distributed systems and large-scale data challenges.
Strong problem-solving skills and a collaborative mindset.
Note: You don't need to check every box. Java experience is strongly preferred, but if you're confident you can pick it up quickly, we're open to that. The same goes for the other competencies on this list.
Nice to have
Experience working with orchestration / workflow engines (e.g., Step Functions, Temporal).
Experience with serverless and/or event-driven architectures (e.g., AWS Lambda, SQS).
Experience with JavaScript/TypeScript (for cross-team work).
Tech stack
Languages: Java, Python
Framework: Spring Boot
Storage: AWS S3, AWS DynamoDB, Apache Iceberg, Redis
Streaming: AWS Kinesis, Apache Kafka, Apache Flink
ETL: AWS Glue, Apache Spark
Serverless: AWS SQS, AWS EventBridge, AWS Lambda, Step Functions
Infrastructure as Code: AWS CDK
CI/CD: GitHub Actions
Required Experience:
Senior IC