Data Engineer, Lakehouse and AI Data Platform (Warsaw, Analyst/Associate)

Goldman Sachs


Job Location:

Warsaw - Poland

Monthly Salary: Not Disclosed
Posted on: 18 hours ago
Vacancies: 1 Vacancy

Job Summary

Description

The Opportunity

Join a team building the data foundations that support the firm's AI and analytics capabilities. This role sits within the engineering effort to develop a modern Lakehouse and AI data platform that enables reliable, well-governed and high-performing data use across the firm.

At Goldman Sachs, engineering teams are positioned at the centre of the business, building scalable systems, solving complex technical problems and turning data into insight. In data engineering roles, the emphasis is on designing, building and maintaining large-scale data platforms, delivering production pipelines, improving reliability and quality, and partnering closely with users of the platform.

This is a delivery-focused role for engineers who want to build robust data assets in production, work with modern data technologies and grow over time within the firm. You will contribute to the data models, pipelines and platform capabilities that underpin analytics, operational decision-making and emerging AI use cases.

Role Summary

As a Data Engineer (Lakehouse and AI Data Platform), you will design, build, test and support data pipelines and curated datasets on the firm's modern data platform. You will work across ingestion, transformation, modelling, optimisation and data quality, helping to deliver data products that are reliable, scalable and fit for purpose.

The role is suited to engineers who are comfortable writing code, working with SQL and distributed data processing, and solving practical delivery problems in a team environment. More experienced candidates may also contribute to technical design, platform standards and the shaping of delivery approaches across a wider set of use cases.

Key Responsibilities

Pipeline Engineering

  • Build, enhance and support batch and streaming data pipelines on the Lakehouse and AI data platform.
  • Refactor or modernise existing data flows where needed to improve reliability, performance and maintainability.
  • Ensure data pipelines are production-ready, well tested and operationally supportable.

Data Modelling and Curation

  • Develop raw, refined and curated datasets that support analytics, reporting and AI use cases.
  • Apply sound data modelling principles to represent business entities, relationships and historical change accurately.
  • Work with consumers to shape data products that are usable, well documented and aligned to business needs.

Data Quality and Reconciliation

  • Implement controls to validate the completeness, accuracy and consistency of data across pipelines and datasets.
  • Use reconciliation approaches to build confidence in production outputs and investigate breaks where they arise.
  • Contribute to clear standards for testing, monitoring and issue resolution.

Delivery and Partnership

  • Work closely with engineers, platform teams and data consumers to deliver agreed outcomes to time and quality expectations.
  • Communicate clearly on progress, risks, dependencies and design choices.

Skills and Experience

Required

  • Bachelor's or master's degree in a relevant discipline, or equivalent practical experience with evidence of strong quantitative skills or data engineering expertise.
  • Strong hands-on programming experience in Python or Java.
  • Good working knowledge of SQL, including troubleshooting, optimisation and data analysis.
  • Ability to learn new tools, internal platforms and delivery workflows quickly.
  • Familiarity with software engineering fundamentals, including version control, testing, release discipline and CI/CD practices.

Data Engineering Capability

  • Understanding of temporal data modelling, including the handling of historical state and change over time.
  • Knowledge of schema design, schema evolution and data compatibility considerations.
  • Understanding of partitioning, clustering and other techniques used to improve data performance at scale.
  • Ability to make sensible design choices across normalised and denormalised models, and between natural and surrogate keys.
  • Practical approach to data quality, reconciliation and root-cause analysis.
  • Experience building or supporting production data pipelines in a collaborative engineering environment.
  • Experience working with distributed data processing frameworks such as Apache Spark.
  • Working knowledge of common data formats such as JSON, Avro and Parquet.

Technology Environment

The role will involve working with a modern and evolving data stack. Candidates are not expected to have deep expertise in every tool from day one, but should bring relevant experience and the ability to work across comparable technologies.

Examples of technologies in scope include:

  • Data processing and logic: ANSI SQL, Apache Spark, Kafka
  • Data formats: JSON, Avro, Parquet
  • Platforms and storage: Snowflake, Apache Iceberg, Databricks, Hadoop ecosystem technologies, Sybase IQ
  • Engineering and deployment: CI/CD tooling, containerised or Kubernetes-based deployment approaches where relevant

You will also work with internal data management and platform tooling, so a practical and adaptable engineering mindset is important.

What We Are Looking For

We are looking for engineers who can deliver well-structured, reliable solutions in production and who take ownership of the quality of what they build. The role suits candidates who are technically strong, pragmatic and comfortable working in a fast-paced environment where data platforms support important business outcomes.

Stronger candidates will typically demonstrate:

  • sound judgement in technical trade-offs
  • attention to detail in data correctness and testing
  • a clear and structured approach to problem solving
  • willingness to work closely with stakeholders and partner teams
  • an interest in developing long-term expertise within the firm
ABOUT GOLDMAN SACHS

At Goldman Sachs, we commit our people, capital and ideas to help our clients, shareholders and the communities we serve to grow. Founded in 1869, we are a leading global investment banking, securities and investment management firm. Headquartered in New York, we maintain offices around the world.

We believe who you are makes you better at what you do. We're committed to fostering and advancing diversity and inclusion in our own workplace and beyond by ensuring every individual within our firm has a number of opportunities to grow professionally and personally, from our training and development opportunities and firmwide networks to benefits, wellness and personal finance offerings, and mindfulness programs. Learn more about our culture, benefits and people at

Were committed to finding reasonable accommodations for candidates with special needs or disabilities during our recruiting process. Learn more:

© The Goldman Sachs Group, Inc. 2023. All rights reserved.
Goldman Sachs is an equal opportunity employer and does not discriminate on the basis of race, color, religion, sex, national origin, age, veteran status, disability or any other characteristic protected by applicable law.




Required Experience:

IC


About Company

The Goldman Sachs Group, Inc. is a leading global investment banking, securities, and asset and wealth management firm that provides a wide range of financial services.
