- Develop data pipelines to ingest, load, and transform data from multiple sources.
- Leverage the Data Platform running on Google Cloud to design, optimize, deploy, and deliver data solutions in support of scientific discovery.
- Use programming languages such as Java, Scala, and Python; open-source RDBMS and NoSQL databases; and cloud-based data store services such as MongoDB, DynamoDB, ElastiCache, and Snowflake.
- Continuously deliver technology solutions from product roadmaps, adopting Agile and DevOps principles.
- Collaborate with digital product managers to deliver robust cloud-based solutions that drive powerful experiences.
- Design and develop data pipelines, including Extract, Transform, Load (ETL) programs, to extract data from various sources and transform it to fit the target model.
- Test and deploy data pipelines to ensure compliance with data governance and security policies.
- Move from implementation to ownership of real-time and batch processing, as well as data governance and policies.
- Maintain and enforce the business contracts on how data should be represented and stored.
- Ensure that technical delivery is fully compliant with Security, Quality, and Regulatory standards.
- Keep relevant technical documentation up to date in support of the lifecycle plan for audits/reviews.
- Proactively engage in experimentation and innovation to drive relentless improvement, e.g., adopting new data engineering tools/frameworks.
- Implement ETL processes that move data between systems, including S3, Snowflake, Kafka, and Spark.
- Work closely with our Data Scientists, SREs, and Product Managers to ensure software is high quality and meets user requirements.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5 years of experience as a data engineer building ETL/ELT data pipelines.
- Experience with data engineering best practices across the full software development life cycle, including coding standards, code reviews, source control management (Git), continuous integration, testing, and operations.
- Experience with the programming languages Python and SQL; Java, C#, C, Go, Ruby, and Rust are good to have.
- Experience with Agile, DevOps, and automation of testing, build, and deployment (CI/CD); experience with Airflow.
- Experience with Docker, Kubernetes, and shell scripting.
- 2 years of experience with a public cloud (AWS, Microsoft Azure, or Google Cloud).
- 3 years of experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL).
- 2 years of experience working on real-time data and streaming applications.
- 2 years of experience with NoSQL implementations (DynamoDB, MongoDB, Redis, ElastiCache).
- 2 years of data warehousing experience (Redshift, Snowflake, Databricks, etc.).
- 2 years of experience with UNIX/Linux, including basic commands and shell scripting.
- Experience with visualization tools such as SSRS, Excel, Power BI, Tableau, Google Looker, and Azure Synapse.
Required Skills: Python, SQL
Basic Qualification:
Additional Skills:
This is a high PRIORITY requisition. This is a PROACTIVE requisition.
Background Check: Yes
Drug Screen: No