Assignment: RQ00163 - Data Science Developer - Senior
Job Title: Data Science Developer Senior
Start Date:
End Date:
Office Location: 525 University Avenue, Toronto
Account: Metrolinx
Department: Solution Delivery
# Business Days: 130.00
Location: Hybrid 2 days onsite/3 days remote
Public Sector Experience: Preferred
Must Haves:
- 5 years of experience in an Azure environment
- 5 years of experience in data engineering with ADF and Databricks
- 5 years of programming experience with Python and SQL
Description
Responsibilities
- Participate in product teams to analyze systems requirements; architect, design, code, and implement cloud-based data and analytics products that conform to standards.
- Design, create, and maintain cloud-based data lake and lakehouse structures, automated data pipelines, and analytics models.
- Liaise with cluster IT colleagues to implement products, conduct reviews, resolve operational problems, and support business partners in the effective use of cloud-based data and analytics products.
- Analyze complex technical issues, identify alternatives, and recommend solutions.
- Support the migration of legacy data pipelines from Azure Synapse Analytics and Azure Data Factory (including stored procedures and views used by BI teams, and Parquet files in Azure Data Lake Storage (ADLS)) to modernized Databricks-based solutions leveraging Delta Lake and native orchestration capabilities.
- Support the development of standards and a reusable framework that streamlines pipeline creation.
- Participate in code reviews and prepare/conduct knowledge transfer to maintain code quality, promote team knowledge sharing, and enforce development standards across collaborative data projects.
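The "reusable framework that streamlines pipeline creation" responsibility above can be sketched as a minimal step registry. This is only an illustrative sketch: `PipelineFramework`, the step names, and the sample records are all hypothetical, and a production version would orchestrate Databricks/ADF activities rather than in-memory functions.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical minimal framework: each step is a named function that
# transforms a list of records; a pipeline is an ordered list of step names.
@dataclass
class PipelineFramework:
    steps: dict[str, Callable[[list[dict]], list[dict]]] = field(default_factory=dict)

    def step(self, name: str):
        """Decorator that registers a transform under a reusable name."""
        def register(fn):
            self.steps[name] = fn
            return fn
        return register

    def run(self, step_names: list[str], records: list[dict]) -> list[dict]:
        """Apply registered steps in order to the input records."""
        for name in step_names:
            records = self.steps[name](records)
        return records

framework = PipelineFramework()

@framework.step("drop_nulls")
def drop_nulls(records):
    # Discard rows missing a ride_id (placeholder business rule)
    return [r for r in records if r.get("ride_id") is not None]

@framework.step("normalize_station")
def normalize_station(records):
    # Standardize station names for downstream joins
    return [{**r, "station": r["station"].strip().upper()} for r in records]

raw = [
    {"ride_id": 1, "station": " union "},
    {"ride_id": None, "station": "bloor"},
]
clean = framework.run(["drop_nulls", "normalize_station"], raw)
# clean == [{"ride_id": 1, "station": "UNION"}]
```

Declaring pipelines as ordered step names (rather than hard-coded call chains) is what makes the framework reusable: new pipelines compose existing, reviewed steps.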
General Skills
- Experience with multiple cloud-based data and analytics platforms and coding/programming/scripting tools to create, maintain, support, and operate cloud-based data and analytics products, with a preference for Microsoft Azure
- Experience designing, creating, and maintaining cloud-based data lake and lakehouse structures, automated data pipelines, and analytics models in real-world implementations
- Strong background in building and orchestrating data pipelines using services like Azure Data Factory and Databricks
- Demonstrated ability to organize and manage data in a lakehouse following the medallion architecture
- Background with Databricks Unity Catalog for governance is a plus
- Proficient in using Python and SQL for data engineering and analytics development
- Familiar with CI/CD practices and tools for automating deployment of data solutions and managing the code lifecycle
- Comfortable conducting and participating in peer code reviews in GitHub to ensure quality, consistency, and best practices
- Experience in assessing client information technology needs and objectives
- Experience in problem-solving to resolve complex multi-component failures
- Experience in preparing knowledge transfer documentation and conducting knowledge transfer
- Experience working on an Agile team
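As a conceptual illustration of the medallion architecture referenced in the skills above, the following self-contained sketch moves hypothetical transit boarding events through bronze (raw), silver (deduplicated, typed), and gold (aggregated) layers. In Databricks these layers would be Delta tables managed with PySpark, not Python lists; every record and field name here is an invented example.

```python
from collections import defaultdict

# Bronze layer: raw events landed as-is, including duplicates and bad rows.
bronze = [
    {"event_id": "a1", "line": "LW", "riders": "120"},
    {"event_id": "a1", "line": "LW", "riders": "120"},    # duplicate landing
    {"event_id": "b2", "line": "ST", "riders": "85"},
    {"event_id": "c3", "line": "LW", "riders": "ninety"}, # unparseable record
]

def to_silver(rows):
    """Silver layer: deduplicate on event_id and enforce types, dropping bad rows."""
    seen, out = set(), []
    for r in rows:
        if r["event_id"] in seen:
            continue
        try:
            riders = int(r["riders"])
        except ValueError:
            continue  # quarantine/reject rows that fail type enforcement
        seen.add(r["event_id"])
        out.append({"event_id": r["event_id"], "line": r["line"], "riders": riders})
    return out

def to_gold(rows):
    """Gold layer: aggregate to a reporting-ready metric (total riders per line)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["line"]] += r["riders"]
    return dict(totals)

silver = to_silver(bronze)
gold = to_gold(silver)
# gold == {"LW": 120, "ST": 85}
```

The point of the layering is that each stage has one responsibility: bronze preserves the source faithfully, silver applies quality rules once, and gold serves curated aggregates to BI consumers such as Power BI.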
Desirable Skills
- Written and oral communication skills to participate in team meetings, write/edit systems documentation, prepare and present written reports on findings and alternative solutions, and develop guidelines/best practices
- Interpersonal skills to explain and discuss the advantages and disadvantages of various approaches
Technology Stack
- Azure Storage, Azure Data Lake, Azure Databricks (Lakehouse), Azure Synapse
- Python, SQL
- Power BI
- GitHub