Job Overview:
Develop data pipelines, data frameworks, applications, and APIs using industry best practices. Make adjustments to adapt to new methodologies that provide the business with increased flexibility and agility.
Drive Proof of Concept (POC) and Proof of Technology (POT) evaluations.
Build data systems and data quality monitoring processes, and ensure the health of the big data system.
Review KPI monitoring and alarms to ensure a successful production system.
Work with stakeholders to understand the business and reporting needs. Propose, design, and develop creative data solutions that provide the required insights, improve/automate processes, and add value.
Influence and educate the team with your experience, ideas, and learnings.
Stay current with the latest development tools, technologies, ideas, patterns, and methodologies; share knowledge by clearly articulating results and ideas to key stakeholders.
Publish and update technical architecture and user/process documentation.
Job Requirements:
Bachelor's degree in Computer Science or a related technical field.
4 years of relevant experience with Scala/Python (PySpark), distributed databases, and Kafka, including solid hands-on experience with multithreading, functional programming, etc.
A good understanding of CS fundamentals: data structures, algorithms, and problem solving.
Professional hands-on experience in SQL and query optimization.
Experience building frameworks for data ingestion and consumption patterns.
Experience with orchestration tools such as Airflow (preferred), Automic, Autosys, etc.
Hands-on experience in data processing and data manipulation.
Expertise with GCP and GCP data processing tools, platforms, and technologies such as GCS, Dataproc, DPaaS, BigQuery, Hive, etc.
Experience with streaming data via Kafka, Structured Streaming, etc.
Exposure to Lambda architecture.
Exposure to visualization tools for data reporting, such as Tableau, Power BI, Looker, etc.
Excellent communication skills for collaborating with teams.
Value productivity and care deeply about helping others work more effectively and efficiently.
Mandatory Skills:
Scala/Python (PySpark), Big Data knowledge, Scripting (Shell/Python), GCP Data Tech Stack, Airflow, CI/CD, Kafka, Distributed Databases, SQL/NoSQL
Required Experience:
Senior IC