Databricks Consultant

Datavail Infotech


Job Location:

Mumbai - India

Monthly Salary: Not Disclosed
Posted on: 2 days ago
Vacancies: 1 Vacancy

Job Summary

Description

Job Title: Senior Associate Developer - Databricks, PySpark, and Spark SQL

Education: Any Graduate

Experience: 5 years

Location: Mumbai

Key Skills:

  • Strong hands-on experience with Databricks, PySpark, and Spark SQL.

  • Expertise in Delta Lake, Bronze/Silver/Gold (medallion) architecture, and Lakehouse patterns.

  • Strong experience with cloud platforms (AWS/Azure/GCP).

  • Solid understanding of data warehousing, dimensional modeling, and big-data concepts.
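The dimensional-modeling skill above can be sketched as a classic fact-to-dimension star-schema join. The example below uses SQLite purely so the sketch is self-contained and runnable; on Databricks the same query shape would run as Spark SQL against Delta tables. All table and column names are hypothetical.

```python
# Hypothetical star schema: one fact table joined to one dimension table.
# SQLite is used here only to keep the sketch self-contained; the SQL shape
# carries over directly to Spark SQL on Databricks.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (sale_id INTEGER, product_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO fact_sales VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")

# Fact-to-dimension join with an aggregate: revenue per category.
cur.execute("""
SELECT p.category, SUM(s.amount) AS revenue
FROM fact_sales s
JOIN dim_product p ON p.product_id = s.product_id
GROUP BY p.category
ORDER BY p.category
""")
rows = cur.fetchall()
conn.close()
```

In a warehouse the dimension table carries descriptive attributes and the fact table carries measures; keeping them separate (normalized dimensions, narrow facts) is what makes queries like this cheap to aggregate.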

Job Description:

  • Build scalable ETL/ELT pipelines using Databricks (PySpark, Spark SQL, Spark Streaming).

  • Develop and optimize Delta Lake tables, including ACID transactions, schema evolution, and time travel.

  • Implement Unity Catalog for data governance and access control; manage cluster configurations, job workflows, and performance tuning in Databricks.

  • Design and implement batch and streaming pipelines using Spark Structured Streaming.

  • Integrate Databricks with multiple data sources (RDBMS, APIs, cloud storage, message queues).

  • Develop reusable, modular, and automated data processing frameworks.

  • Implement CI/CD pipelines for Databricks using GitHub Actions or Azure DevOps; handle cluster management and job orchestration using the Databricks REST APIs.

  • Maintain code quality, unit tests, and documentation.

  • Write and optimize complex SQL queries to ensure high performance and efficient data retrieval.

  • Apply strong database design skills, including normalization, data modelling, and relational schema creation.

  • Conduct performance analysis, troubleshoot database issues such as slow queries or deadlocks, and implement solutions.

  • Design and implement database structures, including tables, schemas, views, stored procedures, functions, and triggers.

  • Optimize database performance through query tuning, indexing, and performance analysis.

  • Ensure data integrity, security, and compliance standards.

  • Strong Python skills combined with expertise in Apache Spark for large-scale data processing. Core abilities include building efficient ETL pipelines, optimizing distributed jobs, and handling large-scale data transformations.

  • Expertise in Python programming, Spark APIs, and parallel processing.

  • Proficiency in Python (including Pandas and NumPy) for data manipulation and scripting.

  • Deep knowledge of PySpark APIs such as DataFrames, RDDs, and Spark SQL for querying and processing.

  • Familiarity with RESTful APIs, batch processing, CI/CD, and monitoring of data jobs.

  • Optimize Spark jobs for performance, troubleshoot issues, and ensure data quality across systems.

  • Collaborate with data engineers and scientists to implement workflows, conduct code reviews, and integrate with cloud platforms such as AWS or Azure.

  • Design, develop, and maintain scalable data pipelines and ETL processes using Azure Databricks.

  • Build data transformation workflows using Python or Scala.

  • Work with data lakes using Delta Lake.

  • Integrate data from multiple sources such as APIs databases and cloud storage.

  • Monitor and optimize data workflows for performance and reliability.

  • Collaborate with data scientists, analysts, and business teams.




Required Experience:

Contract


About Company


Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leadi ...
