Can Vorwerk's brand promise survive without the brand artifact?
Gelinggarantie has always been hardware-bound — the Thermomix on the counter is what makes "success guaranteed" credible. Phase 2 built an actual software artifact in its place. The test is whether that artifact can carry the same promise.
The artifact under test, and why each piece matters for Vorwerk's question
Each component is the software equivalent of a mechanism that makes hardware Gelinggarantie credible.
Curated 53-recipe BBQ corpus (cost-tier, equipment, dietary, flavor metadata) — the content layer. The BBQ equivalent of Cookidoo: validated, grounded knowledge the AI draws from. Without it the AI hallucinates; with it, it speaks from a real corpus.
Four behavior files (home / plan-from-filter / plan-from-recipe / assemble) — the flow layer. Translates Thermomix's fixed sequence (select → ingredients → steps → result) into a conversational arc from intent to plan.
Discovery determinism + completeness gate — the reliability layer. Codified question order and completeness thresholds so the AI behaves like an instrument, not a chatbot. What makes the AI trustworthy the way hardware is trustworthy.
Six tool-driven UI primitives + surface-aware state — the constraint layer. Structured tools (present_choice, present_recipe_cards, route_to, …) keep the AI on a safe path at every key decision point.
Built collaboratively: Igor (agent behavior + corpus), Jürgen (tools + design system), MAD (design + UX).
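A minimal sketch of how the reliability layer's "codified question order and completeness thresholds" could work, assuming a fixed list of discovery slots and a required subset that must be filled before a plan is assembled. The slot names, the required set, and the function names are hypothetical illustrations loosely modeled on the corpus metadata (cost tier, equipment, dietary); they are not taken from the actual behavior files.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical slot order; the real behavior files may ask different questions.
QUESTION_ORDER = ["occasion", "guest_count", "equipment", "dietary", "cost_tier"]

# Assumed completeness threshold: these slots must be answered before planning.
REQUIRED_SLOTS = {"occasion", "guest_count", "equipment"}


@dataclass
class DiscoveryState:
    """Answers collected so far, keyed by slot name."""
    answers: Dict[str, str] = field(default_factory=dict)


def next_question(state: DiscoveryState) -> Optional[str]:
    """Return the first unanswered slot in the fixed order, or None if all are answered."""
    for slot in QUESTION_ORDER:
        if slot not in state.answers:
            return slot
    return None


def gate_open(state: DiscoveryState) -> bool:
    """True once every required slot has an answer."""
    return REQUIRED_SLOTS.issubset(state.answers)
```

Because the order is a fixed list rather than something the model chooses per turn, two participants with the same answers see the same next question, which is the instrument-like behavior the reliability layer aims for.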
We built a working AI planning agent — not a clickable demo. Does this artifact, in actual self-directed use, produce the response Phase 1 documented in moderated discussion?
Without a researcher in the room. Without hardware on the counter. With Vorwerk delivering only software.
(a) Category permission without hardware — can Vorwerk credibly enter BBQ?
(b) Digital Gelinggarantie — can AI companionship deliver the success-guarantee experience that hardware delivered before?
Nested, not parallel: (a) collapses into (b). If the AI companion reliably produces the trust-and-confidence response, Vorwerk has permission — because they're still delivering the core promise, just through a different medium.
Boswell — AI conducting voice interviews with structured insight extraction — is itself an artifact in the category Vorwerk is investing in.
Using AI to do the emotional/relational work of post-experience research IS a demonstration of the capability that motivates the Ember investment in the first place.
The instrument embodies the question.
Measured indirectly via:
Not via direct interrogation. We observe the chain's behavior rather than ask "do you accept Vorwerk in BBQ?"
Measured via Boswell topics 1–5:
F&F sample is acceptable here — this is a product-effect read, not a market-attitude read.
Cohort tag (MAD F&F / Vorwerk insider / Vorwerk-initiated warm referral) is captured per participant at the Boswell briefing step. This enables cohort splits in analysis without dictating Vorwerk's recruitment mix.
Any slippage lands in mid-June as a follow-on rather than gating the early-June close.
Research-informed, conversational, adaptive. Briefing materials bias question generation toward these eight topics; Boswell conducts adaptively.
Bundled session is the default. Async fallback 24–48h with one reminder. Vorwerk-branded prototype from first screen. "Things will break" explicit framing on the landing page.
Coding dimensions (pre-specified):
No pre-committed numerical thresholds. The strength of any claim to Julius follows from the strength of the qualitative pattern observed.
Cohort splits (MAD F&F / Vorwerk insider / Vorwerk-initiated) reported where they reveal a difference.
Sample limits: product-effect read, NOT market-attitude read. Stated upfront in deliverable.
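The cohort splits mentioned above could be tallied along these lines. This is a sketch only, assuming each coded observation is stored as a (cohort tag, code) pair; the function name and the placeholder code labels in the usage note are hypothetical, and the tallies are descriptive counts, not thresholds.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Cohort tags as named in this plan.
COHORTS = ("MAD F&F", "Vorwerk insider", "Vorwerk-initiated")


def cohort_split(coded: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count how often each code appears within each cohort.

    `coded` is an iterable of (cohort_tag, code) pairs.
    Unknown cohort tags raise ValueError so that typos surface early.
    """
    table = defaultdict(Counter)
    for cohort, code in coded:
        if cohort not in COHORTS:
            raise ValueError(f"unknown cohort tag: {cohort!r}")
        table[cohort][code] += 1
    # Report cohorts in a stable order, skipping cohorts with no observations.
    return {c: dict(table[c]) for c in COHORTS if c in table}
```

Called with placeholder rows such as `[("MAD F&F", "code_a"), ("Vorwerk insider", "code_b")]`, it returns one tally per cohort, which can then be compared side by side to decide whether a split reveals a difference worth reporting.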