Candidates must currently live within 50 miles of Washington, DC. (The client will require proof of residency in the form of an ID, utility bill, lease, or mortgage statement.)
Qualifications:
At least ten years of experience in AI, data science, or software engineering, including knowledge of the data ecosystem.
Bachelor's degree in Computer Science, Information Systems, or another related field, or equivalent related work experience, is required.
Data Modeling: Expertise in designing and implementing data models optimized for storage, retrieval, and analytics within Databricks on AWS, including conceptual, logical, and physical data modeling.
Databricks Proficiency: In-depth knowledge of and hands-on experience with the Databricks platform on AWS, including Databricks SQL, the Databricks Runtime, clusters, notebooks, and integrations.
ETL/ELT Processes: Proficiency in developing pipelines that extract data from various sources, transform it to meet business requirements, and load it into the central data lake using Databricks tools and Spark (see the sketch following this list).
Data Integration: Experience integrating data from heterogeneous sources (relational databases, APIs, files) into Databricks while ensuring data quality, consistency, and lineage.
Performance Optimization: Ability to optimize data processing workflows and SQL queries in Databricks for performance, scalability, and cost-effectiveness, leveraging partitioning, clustering, caching, and Spark optimization techniques.
Data Governance and Security: Understanding of data governance principles and experience implementing security measures to ensure data integrity, confidentiality, and compliance within the centralized data lake environment.
Advanced SQL and Spark Skills: Proficiency in writing complex SQL queries and Spark code (Scala/Python) for data manipulation, transformation, aggregation, and analysis tasks within Databricks notebooks.
Cloud Architecture: Understanding of cloud computing principles and of AWS architecture and services for designing scalable and resilient data solutions.
Data Visualization: Basic knowledge of data visualization tools (e.g., Tableau) to create insightful visualizations and dashboards for data analysis and reporting purposes.
Familiarity with government cloud deployment regulations and compliance policies such as FedRAMP, FISMA, etc.
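
As a point of reference for the ETL/ELT, optimization, and Spark items above, here is a minimal PySpark sketch of the pattern: extract from cloud storage, transform with Spark, load into a partitioned Delta table, then compact it. It assumes a Databricks notebook (where spark is predefined); the bucket path, table, and column names are hypothetical placeholders, not part of this posting.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # predefined as `spark` in Databricks notebooks

# Extract: read raw files from a hypothetical S3 landing location.
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/transactions/")

# Transform: enforce types, derive columns, and drop bad records.
clean = (
    raw.withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("trade_date", F.to_date("trade_date"))
       .filter(F.col("amount").isNotNull())
       .cache()  # cache only if several downstream steps reuse this frame
)

# Load: write a Delta table partitioned by date so queries can prune files.
(
    clean.write.format("delta")
         .mode("overwrite")
         .partitionBy("trade_date")
         .saveAsTable("finance.transactions")
)

# Compact small files and co-locate a frequently filtered (non-partition) column.
spark.sql("OPTIMIZE finance.transactions ZORDER BY (account_id)")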
Capabilities:
Leverage financial industry expertise to define conceptual, logical, and physical data models in Databricks to support new and existing business domains.
Work with product owners, system architects, data engineers, and vendors to create data models optimized for query performance and for compute and storage costs.
Define best practices for the implementation of the Bronze/Silver/Gold data layers of the lakehouse (see the sketch following this list).
Provide data model documentation and generated artifacts (data dictionary, data definitions, etc.).
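
For reference on the layering practice above, a minimal sketch of the Bronze/Silver/Gold (medallion) pattern on Databricks follows; all table, path, and column names are hypothetical placeholders.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # predefined in Databricks notebooks

# Bronze: land raw data as-is so the source can always be replayed.
bronze = spark.read.json("s3://example-bucket/landing/trades/")
bronze.write.format("delta").mode("append").saveAsTable("bronze.trades")

# Silver: deduplicate and conform the raw records.
silver = (
    spark.table("bronze.trades")
         .dropDuplicates(["trade_id"])
         .withColumn("trade_date", F.to_date("trade_ts"))
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.trades")

# Gold: business-level aggregates shaped for reporting and dashboards.
gold = (
    spark.table("silver.trades")
         .groupBy("trade_date", "desk")
         .agg(F.sum("notional").alias("total_notional"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold.daily_desk_notional")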