About the role:
We are looking for a Data Platform Engineer to enhance and expand Stability's data platform using AWS and GCP services. This begins with understanding our needs, selecting the appropriate tech stack and tools, gaining buy-in from stakeholders, and choosing the best long-term solution. This role will act as the technical thought leader for our data platform and help hire and mentor other team members.
Responsibilities:
- Architect, implement, and monitor data pipelines, and conduct knowledge-sharing sessions with peers to ensure effective data use
- Work with team members and leaders to further develop our data strategy
- Develop and maintain a data catalog so that we have a full understanding of the data and events within our platform
- Maintain infrastructure-as-code (AWS CDK) provisioning; spin up new and/or revised data platform resources and environments to enhance data collection and processing from our applications and data sources
- Act as the primary technical decision-maker for the data platform and help hire and support analytical data personnel (e.g. Data Scientist, Data Analyst, BI Developer, etc.)
- Analyze large amounts of structured and unstructured data and create dashboards, queries, and visualizations
- Technical experience in the following areas (with examples):
- Containerized compute: Kubernetes (EKS, GKE), ECS, Docker
- Data ingestion: Airbyte, Mage AI, AWS Glue
- Pipeline orchestration: Airflow, Spark
- Data manipulation: Pandas, NumPy, Polars
- Data lake: S3, Iceberg, Delta Lake
- Data warehouse: BigQuery, Redshift
- Database: PostgreSQL
- Data visualization: Looker, Superset/Preset
- Data quality (optional): Great Expectations, Monte Carlo
- Data catalog (optional): Amundsen, AWS Glue Catalog
- Event streaming (optional): Kafka, Flink, Redis Pub/Sub
- Infrastructure-as-code: AWS CDK, Terraform
- CI/CD: GitHub Actions
Qualifications:
- Must have 7 years of hands-on experience working with data platforms
- Experience being involved in the design and implementation of at least one data platform.
- Experience in streaming and/or batch analytics.
- Experience with scalable, distributed data processing, management, and visualization tools.
- Experience using business intelligence tools and data frameworks.
- Strong working knowledge of Python for both provisioning cloud infrastructure and working with data.
- We're a remote-first company, and we need people who can do their best work from home, collaborate effectively, work asynchronously, and proactively reach out to peers to get things done and/or get themselves unstuck.
- Enthusiasm, a can-do attitude, and optimism when faced with difficult challenges.
Required Experience:
Senior IC