- USA
Not Disclosed
Salary Not Disclosed
1 Vacancy
Job Description:
Common data archetypes; writing and coding functions, algorithms, logic development, and control flow; object-oriented programming languages; external libraries; and how to collect data from different sources.
This includes knowledge of scraping, application programming interfaces (APIs), databases, and publicly available repositories.
Structured data, such as from relational database management systems and spreadsheets; semi-structured data, such as log files, Extensible Markup Language (XML), and JavaScript Object Notation (JSON); and unstructured data, such as text, video, audio, and images.
Relational and NoSQL databases, distributed data processing frameworks such as Apache Hadoop and Apache Spark, and MPP databases.
SQL-based querying of databases using joins, aggregations, and subqueries.
Open-source tools, including real-time data processing products such as Apache Beam, Kafka, and Spark Structured Streaming; time series databases such as InfluxDB; relational databases such as Postgres; graph databases such as Neo4j; and software development environments such as Git and GitHub.
Abstraction tools such as Kubernetes for container orchestration.
Mastery of computer programming and scripting languages such as Scala, Java, or Python, as well as an ability to create programming and processing logic.
Experience with machine learning algorithms and automated machine learning to build and automate continuous-learning data processing streams and pipelines.
Data warehousing tools and techniques such as Apache Hive.
Knowledge of a cloud platform, particularly AWS, is also needed.
Full Time