Databricks-T3

Datavail Infotech


Job Location:

Mumbai - India

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Description

Title: Technical Specialist

Location: Mumbai

Education: Bachelor's Degree

Job Description:

  • Build scalable ETL/ELT pipelines using Databricks (PySpark, SQL, Spark Streaming).
  • Develop and optimize Delta Lake tables, including ACID transactions, schema evolution, and time travel.
  • Implement Unity Catalog for data governance and access control; manage cluster configurations, job workflows, and performance tuning in Databricks.
  • Design and implement batch and streaming pipelines using Spark Structured Streaming.
  • Integrate Databricks with multiple data sources (RDBMS, APIs, cloud storage, message queues). Develop reusable, modular, and automated data processing frameworks.
  • Implement CI/CD pipelines for Databricks using GitHub Actions / Azure DevOps; handle cluster management and job orchestration using Databricks REST APIs.
  • Maintain code quality, unit tests, and documentation.
  • Write and optimize complex SQL queries and statements to ensure high performance and efficient data retrieval.
  • Apply strong database design skills, including normalization, data modelling, and relational schema creation.
  • Conduct performance analysis, troubleshoot database issues such as slow queries or deadlocks, and implement solutions.
  • Design and implement database structures, including tables, schemas, views, stored procedures, functions, and triggers.
  • Optimize database performance through query tuning, indexing, and performance analysis.
  • Ensure data integrity, security, and compliance standards.
  • Strong Python skills combined with expertise in Apache Spark for large-scale data processing. Core abilities include building efficient ETL pipelines, optimizing distributed jobs, and handling large-scale data transformations.
  • Expertise in Python programming, Spark APIs, and parallel processing.
  • Proficiency in Python (including Pandas and NumPy) for data manipulation and scripting.
  • Deep knowledge of PySpark APIs such as DataFrames, RDDs, and Spark SQL for querying and processing.
  • Familiarity with RESTful APIs, batch processing, CI/CD, and monitoring of data jobs.
  • Optimize Spark jobs for performance, troubleshoot issues, and ensure data quality across systems.
  • Collaborate with data engineers and scientists to implement workflows, conduct code reviews, and integrate with cloud platforms such as AWS or Azure.
  • Design, develop, and maintain scalable data pipelines and ETL processes using Azure Databricks.
  • Build data transformation workflows using Python or Scala.
  • Work with data lakes using Delta Lake.
  • Integrate data from multiple sources such as APIs, databases, and cloud storage.
  • Monitor and optimize data workflows for performance and reliability.
  • Collaborate with data scientists, analysts, and business teams.
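One of the responsibilities above is job orchestration via the Databricks REST APIs. As a minimal sketch, the snippet below builds (but does not send) the POST request that triggers a job run through the Jobs API 2.1 `run-now` endpoint, using only the Python standard library. The workspace URL, token, and job ID are placeholders, not real credentials.

```python
import json
import urllib.request


def build_run_now_request(host: str, token: str, job_id: int) -> urllib.request.Request:
    """Build (without sending) a POST request that triggers a Databricks job run.

    Targets the Jobs API 2.1 `run-now` endpoint; `host` and `token` stand in
    for a real workspace URL and personal access token.
    """
    payload = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/api/2.1/jobs/run-now",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Inspect the request that would be sent (no network call is made here).
req = build_run_now_request("https://example.cloud.databricks.com", "dapi-XXXX", 123)
print(req.full_url)      # .../api/2.1/jobs/run-now
print(req.get_method())  # POST
```

In practice the request would be dispatched with `urllib.request.urlopen(req)` or a client such as `requests`; separating "build" from "send" keeps the orchestration logic unit-testable without touching a live workspace.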
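Several bullets above concern query tuning and indexing. The sketch below uses an in-memory SQLite database as a stand-in for any RDBMS to show the core idea: adding an index on a filtered column turns a full table scan into an index seek, which `EXPLAIN QUERY PLAN` makes visible. The table and column names are illustrative only.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a production RDBMS; the
# scan-vs-seek principle shown here carries over to other engines.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

query = "SELECT COUNT(*), SUM(total) FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the planner must scan the whole table.
plan_before = conn.execute(f"EXPLAIN QUERY PLAN {query}", (42,)).fetchone()[-1]

# After indexing the filtered column, the planner can seek directly.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = conn.execute(f"EXPLAIN QUERY PLAN {query}", (42,)).fetchone()[-1]

print(plan_before)  # e.g. "SCAN orders"
print(plan_after)   # e.g. "SEARCH orders USING INDEX idx_orders_customer ..."
```

The exact plan wording varies by SQLite version, but the shift from SCAN to an index-backed SEARCH is the signal a query tuner looks for before and after adding an index.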




About Company


Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leadi ...
