Core Skills & Knowledge Areas
- Python
  - Data manipulation: Proficiency with libraries like pandas, NumPy, and PyArrow.
  - Scripting & automation: Writing reusable, modular scripts for data ingestion and transformation.
  - APIs: Consuming and building RESTful APIs for data exchange.
  - Testing: Unit testing with pytest or unittest.
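To make the scripting and testing expectations above concrete, here is a minimal, hedged sketch: a small reusable transformation function plus a pytest-style test. It uses only the standard library (pandas would be the usual tool for real tabular work), and all names (`clean_record`) are illustrative, not part of any required stack.

```python
# A reusable transformation step: normalize raw records before loading.
# Illustrative only; real pipelines would typically operate on pandas DataFrames.

def clean_record(record: dict) -> dict:
    """Strip whitespace from keys and values, and lowercase the keys."""
    return {k.strip().lower(): v.strip() for k, v in record.items()}

def test_clean_record():
    # pytest auto-discovers functions named test_*; plain asserts suffice.
    raw = {" Name ": " Alice ", "AGE": "30"}
    assert clean_record(raw) == {"name": "alice".title(), "age": "30"}
```

Running `pytest` in the containing directory would pick up and execute `test_clean_record` automatically.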
- Cloud Platforms
  - AWS / Azure / GCP: Familiarity with services like:
    - AWS: S3, Lambda, Glue, Redshift, EMR.
    - Azure: Data Factory, Blob Storage, Synapse.
    - GCP: BigQuery, Cloud Functions, Dataflow.
  - Infrastructure as Code (IaC): Tools like Terraform or CloudFormation.
  - Security & IAM: Managing access and permissions.
- Back-End Development
  - Databases: SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, DynamoDB).
  - APIs: Building data services with frameworks like Flask, FastAPI, or Django.
  - CI/CD: Familiarity with Git, Docker, Jenkins, or GitHub Actions.
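As a small illustration of the SQL side of the database skills above, the sketch below uses the standard-library `sqlite3` module as a stand-in for PostgreSQL or MySQL; the `events` table and its columns are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a lightweight stand-in for PostgreSQL/MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 9.5), (1, 0.5), (2, 3.0)],
)

# A typical aggregation: total amount per user.
totals = conn.execute(
    "SELECT user_id, SUM(amount) FROM events GROUP BY user_id ORDER BY user_id"
).fetchall()
# totals == [(1, 10.0), (2, 3.0)]
conn.close()
```

The same `GROUP BY` query would run unchanged on PostgreSQL via a driver such as `psycopg2`; only the connection setup differs.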
- ETL / ELT Pipelines
  - Pipeline orchestration: Tools like Apache Airflow, Prefect, or Luigi.
  - Data transformation: Using SQL, dbt, or Python scripts.
  - Batch vs. streaming: Understanding of Kafka, Spark Streaming, or Flink.
  - Monitoring & logging: Ensuring data quality and pipeline reliability.
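The extract–transform–load pattern behind the pipeline skills above can be sketched in plain Python, with each step as a function. This is a toy batch run under stated assumptions: all data and names (`extract`, `transform`, `load`, `warehouse`) are invented for illustration; in Airflow or Prefect each function would instead be registered as a task in a DAG or flow.

```python
# A toy batch ETL run expressed as ordered steps.

def extract() -> list[dict]:
    # Stands in for pulling from an API, file drop, or source database.
    return [{"city": "Lisbon", "temp_c": 21}, {"city": "Porto", "temp_c": 18}]

def transform(rows: list[dict]) -> list[dict]:
    # Derive a Fahrenheit column and keep only warm cities.
    return [
        {**r, "temp_f": r["temp_c"] * 9 / 5 + 32}
        for r in rows
        if r["temp_c"] >= 20
    ]

def load(rows: list[dict], target: list) -> None:
    # Stands in for writing to a warehouse table or data lake partition.
    target.extend(rows)

warehouse: list = []
load(transform(extract()), warehouse)
# warehouse now holds one Lisbon row with temp_f of roughly 69.8
```

Orchestrators add what this sketch lacks: scheduling, retries, dependency tracking, and the monitoring and logging called out above.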
Tools & Technologies
- Programming: Python, SQL.
- Cloud: AWS, Azure, GCP.
- Orchestration: Airflow, Prefect.
- Databases: PostgreSQL, BigQuery, Redshift.
- Data lakes: S3, Azure Data Lake.
- Containers: Docker, Kubernetes.
- Version control: Git, GitHub/GitLab.
Soft Skills & Other Requirements
- Problem-solving: Ability to debug and optimize data workflows.
- Teamwork: Collaborating with Data Scientists, Analysts, and DevOps.
Some of our perks
- Fresh fruit sometimes, spoiled fruit all the time.
- You can work from anywhere, including your home.
- Flexible hours.
- Team lunches, birthday celebrations, and happy hours.
- Wellness program and company retreats.
- English lessons.
- Courses and training.