Responsibilities include but are not limited to the following:
Optimize Spark clusters for cost efficiency and performance by implementing robust monitoring systems to identify bottlenecks using data and metrics. Provide actionable recommendations for continuous improvement
Optimize the infrastructure required for the extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies (see the ETL sketch after this list)
Work with data and analytics experts to strive for greater cost efficiencies in our data systems
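As an illustration of the ETL responsibilities above, here is a minimal PySpark sketch; the S3 paths, the events feed, and the column names (event_ts, event_type) are all hypothetical, invented for the example:

```python
# Minimal ETL sketch: extract raw JSON from S3, transform with Spark SQL,
# load curated Parquet back to S3. All paths and columns are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events-etl").getOrCreate()

# Extract: read raw JSON events (hypothetical bucket/prefix).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Transform: drop bad records and aggregate daily counts with Spark SQL.
raw.createOrReplaceTempView("events")
daily = spark.sql("""
    SELECT date(event_ts) AS event_date,
           event_type,
           count(*)        AS event_count
    FROM events
    WHERE event_ts IS NOT NULL
    GROUP BY date(event_ts), event_type
""")

# Load: write partitioned Parquet for downstream warehouse/reporting use.
(daily.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-bucket/curated/daily_event_counts/"))
```

On AWS a job like this would typically run on EMR, with the curated output exposed to Redshift or Athena; that stack is an assumption for the sketch, not a statement of this team's setup.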
Who you are:
Experience with processing large workloads and complex code on Spark clusters
Proven experience in setting up monitoring for Spark clusters and driving optimization based on insights and findings (see the monitoring sketch after this list)
Experience in designing and implementing scalable Data Warehouse solutions to support analytical and reporting needs
Strong analytic skills related to working with unstructured datasets
Experience building processes that support data transformation, data structures, metadata, dependency and workload management
Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores
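For the Spark monitoring items above, here is a minimal sketch that polls Spark's built-in status tracker for stage-level bottleneck signals; the app name, polling interval, and sample count are illustrative, and a real setup would ship these numbers to a metrics store rather than stdout:

```python
# Poll Spark's status API and print stage-level task counts; stages with
# a persistently large pending backlog are candidate bottlenecks.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cluster-monitor").getOrCreate()
tracker = spark.sparkContext.statusTracker()

for _ in range(12):  # sample for about a minute while jobs run
    for stage_id in tracker.getActiveStageIds():
        info = tracker.getStageInfo(stage_id)
        if info is None:
            continue
        pending = info.numTasks - info.numActiveTasks - info.numCompletedTasks
        print(f"stage {stage_id} ({info.name}): "
              f"{info.numActiveTasks} active, {pending} pending, "
              f"{info.numFailedTasks} failed")
    time.sleep(5)
```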
Experience using the following software/tools:
Expertise with Python and Jupyter notebooks is a MUST.
Experience with big data tools: Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Working knowledge of stream-processing systems (Storm, Spark Streaming, etc.) is a plus (see the streaming sketch after this list)
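To make the Spark/Kafka and stream-processing items concrete, here is a minimal Structured Streaming sketch; the broker address, topic name, and one-minute window are illustrative, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
# Count events per one-minute window from a Kafka topic and print the
# running totals to the console. Broker and topic names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

# The Kafka source provides key/value bytes plus a timestamp column;
# window on the timestamp to get per-minute event counts.
counts = (
    stream.select("timestamp")
    .groupBy(F.window("timestamp", "1 minute"))
    .count()
)

query = (
    counts.writeStream
    .outputMode("complete")
    .format("console")
    .start()
)
query.awaitTermination()
```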
Full Time