Data Scientist (Big Data Engineer) 2

Job Location: Austin, TX - USA

Monthly Salary: Not Disclosed
Posted on: 1 hour ago
Vacancies: 1

Job Summary

Roles/Responsibilities:

The Worker is responsible for developing, maintaining, and optimizing big data solutions using the Databricks Unified Analytics Platform.

This role supports data engineering, machine learning, and analytics initiatives within an organization that relies on large-scale data processing.

Duties include:

  • Designing and developing scalable data pipelines
  • Implementing ETL/ELT workflows
  • Optimizing Spark jobs
  • Integrating with Azure Data Factory
  • Automating deployments
  • Collaborating with cross-functional teams
  • Ensuring data quality, governance, and security

Mandatory Skills:

  1. Implement ETL/ELT workflows for both structured and unstructured data - 4 years
  2. Automate deployments using CI/CD tools - 4 years
  3. Collaborate with cross-functional teams, including data scientists, analysts, and stakeholders - 4 years
  4. Design and maintain data models, schemas, and database structures to support analytical and operational use cases - 4 years
  5. Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses - 4 years
  6. Implement data validation and quality checks to ensure accuracy and consistency - 4 years
  7. Contribute to data governance initiatives, including metadata management, data lineage, and data cataloguing - 4 years
  8. Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices - 4 years
  9. Proficiency in Python and R programming languages - 4 years
  10. Strong SQL querying and data manipulation skills - 4 years
  11. Experience with Azure cloud platform - 4 years
  12. Experience with DevOps, CI/CD pipelines, and version control systems - 4 years
  13. Working in agile, multicultural environments - 4 years
  14. Strong troubleshooting and debugging capabilities - 4 years
  15. Design and develop scalable data pipelines using Apache Spark on Databricks - 3 years
  16. Optimize Spark jobs for performance and cost-efficiency - 3 years
  17. Integrate Databricks solutions with cloud services (Azure Data Factory) - 3 years
  18. Ensure data quality, governance, and security using Unity Catalog or Delta Lake - 3 years
  19. Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL - 3 years
  20. Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake - 3 years

Desirable Skills:

  1. Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow) - 1 year
  2. Databricks Certified Associate Developer for Apache Spark - 1 year
  3. Azure Data Engineer Associate - 1 year

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala