Location: Budapest, Hungary (hybrid)
Hours: Full Time
About Craft & Chaps
At Craft, we're redefining productivity through software that feels effortless. Chaps is our AI-first product for personal productivity. We integrate best-in-class model providers and build the infrastructure, orchestration, and guardrails that make AI interactions fast, reliable, and human-centric at scale.
Why this role matters
This is a multi-layered platform role: your product is the AI integration and execution fabric that powers Chaps agent runtimes. You'll build the clear interfaces, safe defaults, guardrails, and dependable operations that make the right thing the easy thing. We design systems to evolve: build for today, instrument and learn, then improve or replace when the data says so. Shipping matters, but shipping correct, observable, SLO-backed systems matters more.
What you'll do
Own the LLM gateway/broker for multiple providers: routing, fallback, retries/backoff, timeouts, streaming, and circuit breakers that prefer correctness and predictable UX over raw speed.
Agent orchestration foundations: resilient workflows, state management, idempotency, and task queues so multi-step systems behave predictably in production.
Retrieval & memory layers: operate and evolve contextual stores/caches to improve quality and latency, with graceful degradation paths.
Design for change: modular contracts, versioned prompts/configs, and migration paths that make replacing components low-risk and fast.
Reliability & performance: define SLIs/SLOs for AI calls and agent workflows; reduce p95/p99 latency without inviting chaotic agent behavior; lead incident response and blameless RCAs.
Cost & quota stewardship: token accounting, per-feature/tenant budgets, alerts, and usage attribution.
Observability end-to-end: metrics/logs/traces, prompt/response analytics, and dashboards that tie infra signals to user experience.
Evaluation & experiments: offline/online eval harnesses, progressive delivery, and guardrail pipelines to ship improvements with confidence.
Developer platform: opinionated CI/CD, IaC, service templates, SDKs/clients, and docs that make secure, observable integrations the default so teams move quickly and safely.
You might be a good fit if you
Have strong software engineering fundamentals and have scaled production apps.
Have scaled integrations with AI APIs and understand token/latency budgets, streaming, caching, and rate-limit management.
Have experience with the Cloudflare stack, Postgres/Redis/Vector DBs, event systems, API gateways, and secrets management.
Are comfortable letting go of tech choices: deleting code that no longer serves the product is a win.
Balance excitement with pragmatism: you try new tech where it matters and fit it thoughtfully into existing systems.
Communicate clearly and collaborate well with design and product.
Thrive in small, high-ownership teams and like moving quickly with high standards.
Our Culture
We sweat details and build platforms that disappear into the background for users and developers.
We ship, learn, and refine, accepting that some systems are stepping stones.
We prefer durable correctness to fragile speed.
We innovate where it counts and keep a pragmatic eye on the whole product.