Atlas Invest | Senior Data-Focused Backend Developer

SD Solutions


Job Location:

Warsaw - Poland

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Department:

Software Development

Job Summary

On behalf of Atlas Invest, SD Solutions is looking for a talented senior, research-oriented Data Engineer / Data-Focused Backend Developer who can take a feature idea from concept through research, data validation, modeling, and full implementation. You will play a key role in designing, developing, and maintaining our core services, with a focus on performance, reliability, and scalability.

SD Solutions is a staffing company operating globally. Contact us to get more details about the benefits we offer.

As a Data-Focused Backend Developer, you will own the full arc from idea to impact. "End-to-end" here isn't just a buzzword; it means you translate abstract problems into testable hypotheses, and that the same person who reads a paper on hybrid document classification prototypes it in a notebook, evaluates it with DSPy metrics, wires it into a LangGraph node, and deploys it into our production Python/TypeScript monorepo.

You will bridge the gap between abstract research and concrete engineering. You won't stop at a notebook win or build isolated models; you will build the pipelines, FastAPI services, and TypeScript integrations that serve them to the real world, ensuring reliability and measurable business value. We are looking for someone who can seamlessly navigate the boundary between high-level AI orchestration and low-level system reliability.

Your First 90 Days

Month 1: Codebase Mastery & First Shipped Wins

  • Get fully onboarded by successfully running the monorepo locally and tracing a live data request through our core AI and data services within your first few days.
  • Ship your first pipeline improvement to production (e.g. an extraction fix or a schema normalization) by the end of Week 1.
  • Reproduce a notebook experiment, publish a short gap analysis, and transition your first DSPy or LangGraph prototype into a tested FastAPI service.

Month 2: Pipeline Ownership & The Research Flywheel

  • Take end-to-end ownership of a complex pipeline component (like due diligence intelligence or multi-source data fusion).
  • Deliver a new evaluation harness tied to a live pipeline and immediately use it to measure and drive a real-world performance increase.
  • Productionize a research-driven upgrade (like a new DSPy optimizer strategy) with clear before/after metrics.

Month 3: Architecture & Scale

  • Lead the architecture of a next-generation research initiative (e.g. advanced GraphRAG or a new autonomous diligence agent) from abstract idea to production deployment.
  • Define and accelerate a repeatable research-to-release playbook for your domain, setting the standard for how we bridge AI research and production engineering.

What You Will Own

  • AI Extraction Pipelines: Design and ship improvements to the OCR → Classify → Extract pipeline (built on PaddleOCR, LangGraph, and DSPy) to reduce extraction error and latency for complex document types such as T12 financials, rent rolls, and appraisals.
  • Scale Data Normalization: Expand our property data aggregation layer. You will pull data from top-tier real estate and demographic APIs, optimizing schema normalization and conflict resolution to unify external datasets with our internal systems.
  • Strengthen Automated Risk Engines: Improve the underlying engine to generate smarter, cleaner, higher-quality risk assessments.
  • Optimize Property Intelligence Pipelines: Enhance automated data enrichment to deliver instantaneous, actionable insights on asset-specific attributes and external risk factors.
  • External Provider Resilience: Expand and maintain our TypeScript-based provider ecosystem, ensuring reliability against third-party outages via robust caching, retries, and observability.
  • Drive the Research Flywheel: Conduct systematic gap analyses using custom evaluation suites (accuracy, precision/recall) on current modules. You will identify the next 2-3 bottlenecks, feed them back into the engineering loop, and implement academic approaches (e.g. SOTA advanced chunking, multi-step RLM reasoning) to continuously boost precision and recall.
  • Orchestrate Agentic Workflows: Use LangGraph to build complex, fault-tolerant state machines that connect our document classification, OCR, and schema extraction modules.

What hard skills do we need?

Note: We don't expect you to have every single skill listed below; that's nearly impossible. We value equivalent skills and a proven ability to learn fast, especially when it comes to specific technologies like DSPy or Neo4j Cypher.

  • Languages: Python 3.12 (FastAPI/Pydantic), TypeScript (strict mode/Zod), SQL/Cypher, and the newest programming language -> English.
  • AI/ML/LLM Systems: Prompt/DSPy optimization, LangGraph orchestration, vector retrieval (Weaviate, Elastic, or alternatives), prompt/eval loops, and multi-model integrations (OpenAI, Gemini, vLLM).
  • Data & Graphs: Neo4j modeling, schema design, multi-source data fusion; ORMs (SQLAlchemy, Prisma, or Drizzle) are an advantage.
  • Document Intelligence: Working with pre-implemented OCR pipelines, document parsing, and classification under noisy real-world inputs, files, and tables.
  • Production Engineering: Monorepo tooling, Docker/Docker Compose, message queues (RabbitMQ or others), and observability (tracing, structured logging).
  • Experimentation: Comfort in Jupyter notebooks for rapid prototyping, benchmark/evaluation harnesses, reproducible experiments, and A/B metric tracking.
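The evaluation harnesses mentioned above ultimately reduce to comparing predicted labels against a gold set. A minimal, dependency-free sketch of a per-class precision/recall check (the label names are illustrative, not the real taxonomy):

```python
def precision_recall(predicted: list[str], gold: list[str],
                     positive: str) -> tuple[float, float]:
    """Compute precision and recall for one positive class."""
    # Chained comparisons: p == positive == g means both equal `positive`.
    tp = sum(p == positive == g for p, g in zip(predicted, gold))
    fp = sum(p == positive != g for p, g in zip(predicted, gold))
    fn = sum(g == positive != p for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical gold labels vs. model predictions for four documents.
gold = ["rent_roll", "appraisal", "rent_roll", "t12"]
pred = ["rent_roll", "rent_roll", "rent_roll", "t12"]
p, r = precision_recall(pred, gold, positive="rent_roll")
# p = 2/3 (one false positive), r = 1.0 (no rent rolls missed)
```

In practice you would run this per document type over a versioned benchmark set, so a pipeline change produces comparable before/after numbers.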

Core Responsibilities:

  • Identify and onboard new data sources
  • Perform data comparisons & validation
  • Assess data quality and usability
  • Define the modeling approach
  • Implement and productionize solutions
  • Work independently with minimal structure

Team X @ Atlas: Mission & Culture

Atlas Invest's Team X is building the intelligence layer for real estate. We ingest, normalize, and reason over the messiest data in one of the world's largest asset classes: property records scattered across multiple external providers, complex ownership networks buried in public filings, and financial details locked inside massive, unstructured rent rolls and appraisals.

Team X is a diverse, high-performing squad of engineers and researchers within Atlas. We value ownership, velocity, and craftsmanship. We ship a polyglot monorepo and treat the boundary between research and production as a feature, not friction. You will join a culture where people are trusted to run with ambiguity, publish Jupyter experiments on Monday, and deploy those results to production by Friday.

About the company:

Atlas Invest is transforming the bridge-loan landscape, seamlessly connecting investors with real estate developers using advanced big-data analytics for a personalized investment experience.

By applying for this position, you agree to the terms outlined in our Privacy Policy. Please take a moment to review it and make sure you understand its contents. If you have any questions or concerns regarding our Privacy Policy, please feel free to contact us.


Required Experience:

Senior IC

