What You'll Do:
- Design, develop, and optimize ETL/ELT pipelines using Databricks (PySpark) and GCP services.
- Build scalable data pipelines that ingest, process, and transform structured and unstructured data from various sources.
- Conduct exploratory data analysis (EDA) to understand data behavior, assess quality, identify trends and anomalies, and inform downstream modeling or reporting needs.
- Develop and maintain data models and schemas within BigQuery to support analytical and reporting needs.
- Work with stakeholders, analysts, and data scientists to understand data requirements and deliver clean, curated datasets.
- Implement data quality checks, monitoring, logging, and alerting for robust data operations.
- Write clean, maintainable, and well-tested code, primarily in Python.
- Leverage GCP tools such as Cloud Storage, Dataflow, Pub/Sub, and Cloud Functions to support data operations.
- Collaborate with DevOps and platform teams to ensure data infrastructure is scalable, secure, and cost-effective.
What You Know:
- 5 years of experience in data engineering or related roles.
- Strong proficiency in Python for data processing and exploration.
- Hands-on experience with Databricks, Spark/PySpark, and advanced SQL.
- Proven experience with Google Cloud Platform (GCP) services, especially BigQuery.
- Experience conducting exploratory data analysis (EDA) to uncover data insights and ensure readiness for analytical use.
- Experience with orchestration tools such as Airflow, Cloud Composer, or similar.
- Solid understanding of data modeling, data warehousing concepts, and performance tuning.
- Familiarity with CI/CD, version control (Git), and infrastructure as code (e.g., Terraform) is a plus.
- Strong problem-solving skills and ability to work independently in a fast-paced environment.
Preferred Qualifications:
- Experience with streaming data pipelines using Kafka or Pub/Sub.
- Exposure to data governance, security, and privacy compliance practices.
- Knowledge of machine learning workflows is a plus.
Education:
- Bachelor's or master's degree in computer science, engineering, or a related field.
Compensation:
$120K-$130K/Year