- Advanced SQL knowledge, including experience with relational databases and query authoring, as well as working familiarity with a variety of databases.
- Extensive experience with BigQuery, Dataproc, and Dataflow on Google Cloud Platform. Experience with Azure Databricks is an added advantage (not mandatory).
- Experience with cluster capacity configuration and cloud resource optimization to meet application demand.
- Programming experience in Python, shell scripting, PySpark, and other data-oriented languages.
- Programming experience with the Apache Beam Java SDK for building robust, high-volume data pipelines and deploying them to GCP Dataflow, including CI/CD processes for deploying these pipelines in GCP.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytical skills related to working with data visualization dashboards, metrics, and related tooling.
- Experience building processes that support data transformation, data structures, metadata, dependency management, and workload management.
- A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
- Familiarity with deployment tools such as Docker and with building CI/CD pipelines.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- 8+ years' experience in software development and data engineering.
- Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field; a postgraduate/master's degree is preferred.
- Experience in machine learning and data modeling is a plus.
Required Skills: Data Warehouse, Python
Additional Skills: Software Developer, Data Warehouse Engineer