Data Pipeline Development
- Design, build, and maintain ETL/ELT pipelines in Databricks to ingest, clean, and transform data from diverse product sources.
- Construct gold layer tables in the Lakehouse architecture that serve both machine learning model training and real-time APIs.
- Monitor data quality, lineage, and reliability using Databricks best practices.
AI-Driven Data Access Enablement
- Collaborate with AI/ML teams to ensure data is modeled and structured to support natural language prompts and semantic retrieval using first- and third-party data sources, vector search, and Unity Catalog metadata.
- Help build data interfaces and agent tools that interact with structured data, and AI agents that retrieve and analyze customer data with role-based permissions.
API & Serverless Backend Integration
- Work with backend engineers to design and implement serverless APIs (e.g., via AWS Lambda with TypeScript) that expose gold tables to frontend applications.
- Ensure APIs are performant, scalable, and designed with data security and compliance in mind.
- Utilize Databricks and other APIs to implement provisioning, deployment, security, and monitoring frameworks for scaling up data pipelines, AI endpoints, and security models for multi-tenancy.
Qualifications :
- 3 years of experience as a Data Engineer or related role in an agile, distributed team environment, with a quantifiable impact on business or technology outcomes.
- Proven expertise with Databricks, including job and workflow orchestration, change data capture, and medallion architecture.
- Proficiency in Spark or Scala for data wrangling and transformation on a wide variety of data sources and structures.
- Practitioner of CI/CD best practices and test-driven development, with familiarity with the MLOps/AIOps lifecycles.
- Proven ability to work in an agile environment with product managers, front-end engineers, and data scientists.
Additional Information :
Preferred Skills
- Familiarity with AWS Lambda (preferred) and API Gateway, or equivalent serverless platforms; knowledge of API design principles and experience working with RESTful or GraphQL endpoints.
- Exposure to React-based frontend architecture and the implications of backend data delivery on UI/UX performance, including end-to-end telemetry to measure performance and accuracy for the end-user experience.
- Experience with A/B testing, experiment and inference logging, and analytics.
Remote Work :
Yes
Employment Type :
Full-time