Senior Data Analyst
New York City, NY - USA
Job Summary
- Location Requirement: Applicants must be local and able to attend a final in-person interview.
- Interview Process: Includes a live SQL coding and PySpark problem-solving session.
- Certifications:
- Databricks Data Engineer certification (most important; can be completed after onboarding).
- Google Cloud certification (acceptable post-hire).
- SQL Skills:
- Ability to independently write and execute SQL queries.
- Proficient in data extraction and basic to intermediate data transformations.
- PySpark Expertise:
- Strong hands-on experience required.
- Core responsibility: building data pipelines and performing large-scale data processing.
- Python/Pandas Skills:
- Required for data cleaning, transformation, and analysis.
- Machine Learning:
- Accounts for 20% of the role.
- Must have hands-on experience with ML (classification, regression, clustering, and model evaluation).
- No ML training will be provided.
- Data Analysis Focus:
- Majority of work involves SQL-based data extraction and PySpark/Pandas-driven analysis and insights.
- Workflow Integration:
- Should understand end-to-end workflows that combine SQL, PySpark, and ML.