JD for Jr. Data Scientist/Engineer
Must Haves:
- 3+ years of professional experience as a data engineer
- 3+ years working with Python and SQL.
- Experience with state-of-the-art machine learning algorithms such as deep neural networks, support vector machines, boosting algorithms, random forests, etc. preferred
- Experience conducting advanced feature engineering and dimensionality reduction in a Big Data environment preferred
- Strong SQL skills in a Big Data environment (Hive, Impala, etc.) a plus
Things that would stand out on a resume:
- Master's degree in Computer Science or Data Science
- Previous company: a bank or eCommerce firm
Equifax is looking for a Statistical Consultant/Data Engineer to join our world-class Global Identity and Fraud Analytics team. In this exciting role, you will have the opportunity to work on a variety of challenging projects across multiple industries, including Financial Services, Telecommunications, eCommerce, Healthcare, Insurance, and Government.
What You'll Do
- Work with the data science team to migrate analytical data and projects to the GCP environment and ensure a smooth project transition
- Build data and analytics automation pipelines for self-service machine learning projects: gather data from multiple sources and systems; integrate, consolidate, and cleanse the data; and structure it for use in our client-facing projects
- Design and code analysis scripts that run on GCP using BigQuery/Python/Scala, leveraging multiple EFX Core data sources
- Understand best practices for data management, maintenance, and reporting, and use that knowledge to improve our solutions
Qualifications:
- 3+ years of professional data engineering or data wrangling experience in:
  - Working with Hadoop-based or cloud-based big data management environments
  - Bash scripting or similar experience for data movement and ETL
  - Big data queries in Hive/Impala/Pig/BigQuery (proficiency with BigQuery API libraries for data-prep automation is a plus)
  - Advanced Python programming (Scala is a plus), with strong coding skills and working experience in Data Studio, Bigtable, and GitHub (Cloud Composer and Dataflow are a plus)
  - Basic GCP certification is a plus
  - Knowledge of Kubernetes (or other GCP-native container-orchestration tools for automating application deployment, scaling, and management) is a plus
  - Basic knowledge of machine learning (ensemble models, unsupervised models), with experience using TensorFlow and PyTorch, is a plus
  - Basic knowledge of graph mining and graph data models is a plus