Plan: Laureo AI — personalized prompts (precomputed) + interactive capability tour [REVISED]

resolved-cell: full·dynamic regime: high

Created: 2026-06-16 (rev 2)

Branch: main

Worktree: (none — authored on main; /implement should create one via /worktree)

PLANNING PHASE ONLY — NOT IMPLEMENTED. Nothing built. Owner greenlight required before /implement.

REVISION 2 — supersedes rev 1 after owner review. Driven by 4 verified research threads (3 code-grounded, 1 external cited): (a) the "engine already exists" claim was misleading — corrected below; (b) re-architected around background precomputation (owner's ask: prompts in the DB *before* login, page just reads); (c) added an org/data-personalized library section; (d) added a major new pillar — the interactive first-run capability tour. Every design call is backed by code evidence or a cited source (see Evidence base).

The problem (owner's words)

Owner: *"The current prompts are extremely generic and not personalized… the few prompts we display need to immediately show the powerful capabilities of the Laureo AI assistant so that users see it's not just a chatbot like ChatGPT that is disconnected from your work tools."* Plus, on review: *"you mentioned the prompt engine already exists but I personally do not see it… the cached/instant prompts should always be cached in the background, not computed in real time… the data is already in our database prior to the user even logging in."* Plus two new asks: a personalized library section (org/own-data), and an interactive onboarding capability tour — a dismissible first-run experience ("Step 1 of 5") that demonstrates the breadth of the assistant on manufactured data with clickable demos, because *"users don't know its breadth of capabilities… if they're not nudged, they're just not going to get there."*

CORRECTED current-state truth (verified, file:line)

My earlier claim that "the engine renders on detail pages" was misleading. The verified truth:

The LLM opener-chip engine (/api/ai/opener-chips + lib/hooks/useOpenerChips.ts) is real and correctly wired to company/person/opportunity context (lib/hooks/useAIPageContext.ts:12-14 → GlobalAIPanel.tsx:167-171), but it only renders inside a CLOSED floating bubble the user must first open. GlobalAIPanel is mounted once globally (app/AuthenticatedChrome.tsx:341) and returns null until opened (GlobalAIPanel.tsx:181-184: skips / and /ai, returns null unless isOpen, which defaults to false). The launcher is a separate "Ask Laureo" FAB (components/ai/AIPanelTrigger.tsx:35-56) or the Cmd/Ctrl+. shortcut. Chips appear only in the empty state (AIPanel.tsx:279,381). The owner saw nothing because the panel was never opened: it is not broken, and not seat-gated to blank. (Every gate, whether seat, billing, timeout, or restricted, degrades to *fallback chips* or a subscribe overlay, never to literal nothing; the only path that returns no chips is a 429, and even then the client backfills static defaults.)
The main /ai page and the home-dashboard widget never pass suggestionChips, so they render the 6 hardcoded, sales-only, all-"analyze-my-data" EMPTY_STATE_CHIPS (AIPanel.tsx:59-66). This is the surface to fix.
Net: this is ~60% re-target/redesign + ~40% net-new (the precompute pipeline and the tour are new).

Locked decisions

Owner-locked (rev 1): Engine = Hybrid (rules + LLM) · Library "More" = in-chat slide-over drawer · Pinning ("make it mine") = in v1.

Owner-locked (rev 2): Build sequencing = all four pillars together in a single v1 ship · Tour launch = opt-in dismissible card · Tour demo data = scripted fake data only (no DB writes, no live LLM call during the tour).

Research-driven design choices folded in this revision:

Hybrid now means precomputed in the background (rules-for-all + LLM-for-active), stored in the DB, read instantly at load — NOT computed at page load (owner's ask, and the aiToday/ai_user_writing_styles precedent supports it).
Prompts must name real CRM records (the #1 evidence-backed lever for clicks — NN/g) and never dead-end on empty data (the AI-specific tour hazard).
The tour is an opt-in, dismissible card (never a forced modal — strong UX consensus), ≤5 steps, interactive on manufactured data.

Evidence base (the proof — full URLs in the Citations section)

Suggested prompts get ignored when done wrong. NN/g's Amazon "Rufus" study: *"None of our participants proactively clicked a prompt suggestion."* Causes: competing mental model (a search bar), generic phrasing, surprise destination, weird icon/label. → Counter each: keep prompts inside the dedicated assistant, personalize with real specifics, do exactly what the label says, plain language. [NN/g Rufus]
Specific + personalized wins. NN/g: *"Broad or generic prompt suggestions are rarely effective… Specific and targeted suggestions… are more likely to lead to meaningful interaction."* Placement *"near the text input field."* Context-aware to role/history. [NN/g use-case prompts]
Follow-up suggestions are the highest-value type (post-answer, tailored). Empty-state "use-case" prompts are lower-yield → must be made specific to earn the click. [NN/g prompt-suggestion taxonomy]
The "gulf of envisioning." Users can't envision what an LLM can do (capability/instruction/intentionality gaps). Remedy: suggest concrete prompt ideas, domain-specific entry points, show explainable output. [CHI 2024] Reinforced by NN/g's "articulation barrier" (*<20%* of people are fluent enough for bare prompt boxes → use chips/GUI, not an empty field).
People underuse AI. About 65% of US workers use AI little or not at all (Pew), and about 23% use it weekly at work (NBER). Users rarely discover the assistant's breadth unaided, so nudging is justified.
Closest analogs validate the thesis. HubSpot Breeze grounds every answer in real CRM data (the differentiator); Intercom Fin demos on the tenant's own/derived data before commit; Copilot uses a curated prompt gallery + ≤10-prompt welcome; Gemini uses a 4-card empty state. [vendor docs]
Tours: short, skippable, interactive, never dead-end. 3–5 steps (completion falls off past 5); visible Skip on every step; action-driven ("show, don't tell") beats passive tooltips; forced modals and highlight-everything are anti-patterns; AI-specific hazard = a step that dead-ends on empty data → seed/label sample data. [Thinkific, Appcues, Userpilot — magnitudes directional]

Pillar 1 — Precomputed, personalized starter prompts (the engine)

1a. Background precompute architecture (owner's core ask)

Build it the ai_user_writing_styles way (verified precedent), not the lazy-Redis way:

Storage — a dedicated table ai_starter_prompts (clone ai_user_writing_styles, migration 20260413140000_email_ai_overhaul.sql:118-160):

  CREATE TABLE IF NOT EXISTS ai_starter_prompts (
    user_id UUID PRIMARY KEY REFERENCES profiles(user_id) ON DELETE CASCADE,
    organization_id BIGINT NOT NULL REFERENCES organizations ON DELETE CASCADE,
    prompts JSONB NOT NULL DEFAULT '[]',      -- [{id,label,prompt,category}], the visible set + a few extra
    schema_version INT NOT NULL DEFAULT 1,
    source TEXT NOT NULL DEFAULT 'rules',     -- 'rules' | 'llm'
    last_computed_at TIMESTAMPTZ,
    next_refresh_at TIMESTAMPTZ DEFAULT NOW(),
    enabled BOOLEAN DEFAULT true,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
  );
  -- plus: partial index (next_refresh_at) WHERE enabled; index (organization_id); own-rows RLS USING (user_id = auth.uid())

Why a table, not Redis: the owner wants the set present before login; aiToday's Redis day-key is lazily populated on first request (aiToday.ts:418-434), and lib/CLAUDE.md bans Redis for user data in SSR. A table row is read in the existing page prefetch path with zero added LLM calls.

Compute module lib/ai/starterPrompts.ts — model on lib/home/aiToday.ts (the closest behavioral precedent — same "rules substrate + once-a-day LLM narration" shape):
buildStarterPromptsSubstrate(ctx) — pure rules, no LLM, free. Selection algorithm in 1b. Reuses the cheap RPCs get_action_items / get_work_queue_counts (already used by aiToday, via lib/homeQueries.ts).
llmRefineStarterPrompts(...) — upgrades ≤2 llmUpgradable slots to name real records, via getOrgAIConfig → getModelForComplexity('micro') → getAiEnforcementResult → callAI → trackAIUsage → settleReservation (the canonical billing chain). Reuse PROMPTS.openerChips + parseChipsNDJSON.

Refresh cron app/api/cron/refresh-starter-prompts/route.ts — clone refresh-writing-styles/route.ts: verifyCronRequest first, createAdminClient, select next_refresh_at < now() .limit(50), skip restricted orgs via getRestrictedOrgIds(), group by org for getOrgAIConfig, push next_refresh_at on failure, maxDuration=300. Register { path:'/api/cron/refresh-starter-prompts', schedule:'0 2 * * *' } in scripts/setup-qstash-schedules.ts and run it (QStash, never vercel.json crons).
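
The batching step of the cron (skip restricted orgs, group by org so the org AI config is resolved once per org, push next_refresh_at on failure) can be sketched as two pure functions. This is a minimal sketch, not the route itself: the DueRow shape, the Set<number> form of the restricted-org list, and the one-hour retry delay are assumptions for illustration. The real route would get its rows from the next_refresh_at < now() query with .limit(50).

```typescript
// Hypothetical row shape: the columns the cron would select from ai_starter_prompts.
interface DueRow {
  user_id: string;
  organization_id: number;
}

// Skip restricted orgs, then group the remaining users by org so the
// per-org AI config only has to be resolved once per group.
function groupDueByOrg(
  rows: DueRow[],
  restrictedOrgIds: Set<number>, // assumed shape of the getRestrictedOrgIds() result
): Map<number, string[]> {
  const byOrg = new Map<number, string[]>();
  for (const row of rows) {
    if (restrictedOrgIds.has(row.organization_id)) continue;
    const users = byOrg.get(row.organization_id) ?? [];
    users.push(row.user_id);
    byOrg.set(row.organization_id, users);
  }
  return byOrg;
}

// On a per-user failure, push next_refresh_at forward so the row is retried
// on a later run instead of blocking the queue. The default delay is an assumption.
function nextRefreshAfterFailure(now: Date, delayMs: number = 60 * 60 * 1000): string {
  return new Date(now.getTime() + delayMs).toISOString();
}
```

Keeping these steps pure makes the "cron smoke" test cheap: the route only has to wire the query result and the billing chain around them.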

Cost bounding (proven by aiToday + writing-styles): rules-for-ALL (free, every user, the permanent fallback) + LLM-refine only ACTIVE users in paying/non-restricted orgs (~$0.05/run on micro tier, restricted-skip, billing-gated).
Active-user signal — the one blocker: there is no queryable last_sign_in column on profiles. Resolution (default): a session-start top-up via the existing after() hook in app/api/auth/callback/route.ts (it already runs createUserSession + after() background work on every login) — fire-and-forget "refresh my starter prompts if next_refresh_at is past." This makes "active" definitionally correct (they just logged in) and spends nothing on dormant users. (Fallback option B: add profiles.last_active_at and filter in the cron. user_sessions.last_active_at from lib/sessionTracking.ts also exists if a join is preferred.)
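
The "refresh my starter prompts if next_refresh_at is past" check in the login hook reduces to a small predicate. The sketch below assumes a RefreshState shape (a subset of the ai_starter_prompts row) and assumes that a missing row or a null next_refresh_at counts as due; neither is confirmed by the existing code.

```typescript
// Hypothetical subset of the ai_starter_prompts row read at login.
interface RefreshState {
  enabled: boolean;
  next_refresh_at: string | null; // ISO timestamp, or null if never scheduled
}

// True when the login hook should enqueue a background refresh.
// Assumption: a missing row or a null next_refresh_at counts as "due".
function isRefreshDue(row: RefreshState | null, now: Date): boolean {
  if (row === null) return true;
  if (!row.enabled) return false;
  if (row.next_refresh_at === null) return true;
  return new Date(row.next_refresh_at).getTime() <= now.getTime();
}
```

Because the check runs inside after(), a true result would only schedule background work; the login response itself never waits on it.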

Provisioning seed: write the rules substrate row once at org/user provisioning (source='rules', next_refresh_at=now()), "write only when empty" — so it's in the DB before first login.

Consumer change (zero LLM at load): app/ai/page.tsx AIPrefetch (already does a Supabase read into swrFallback, lines 62-90) adds a read of ai_starter_prompts → seeds the chip set. AIPanel empty state renders the stored set; EMPTY_STATE_CHIPS stays as the ultimate client fallback. The page just reads a column — no compute, no LLM, no TTFB hit. Same engine feeds the home widget (DashboardAIPanel) and the detail-page bubble.
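
The read-side fallback chain (stored set first, EMPTY_STATE_CHIPS last) can be sketched as a pure resolver. The Chip shape mirrors the {id,label,prompt,category} objects in the prompts column; the function name and the defensive shape check are assumptions, not existing code.

```typescript
interface Chip {
  id: string;
  label: string;
  prompt: string;
  category?: string;
}

// Runtime guard: the JSONB column is untyped at the boundary.
function isChip(x: unknown): x is Chip {
  if (typeof x !== 'object' || x === null) return false;
  const c = x as Record<string, unknown>;
  return typeof c.id === 'string' && typeof c.label === 'string' && typeof c.prompt === 'string';
}

// Use the precomputed set when it is present and well-formed; otherwise fall
// back to the static client defaults. The row may hold a few extras, so cap it.
function resolveEmptyStateChips(stored: unknown, fallback: Chip[], max: number = 6): Chip[] {
  if (!Array.isArray(stored)) return fallback;
  const valid = stored.filter(isChip).slice(0, max);
  return valid.length > 0 ? valid : fallback;
}
```

No branch of this resolver awaits anything, which is the point of the design: the page render depends only on data already fetched by the prefetch.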

1b. Selection algorithm (`lib/ai/promptLibrary/select.ts`, pure)

selectStarters(ctx) -> Prompt[]   // ctx = {role, seatType, features, counts, timeOfDay, activationScore, prefs}
  candidates = CATALOG
    .filter(roles.includes(role ?? 'sales_rep'))
    .filter(!requiresFeature || features.includes(requiresFeature))
    .filter(!nonEmptyKey || counts[nonEmptyKey] > 0)          // NEVER dead-end (evidence #7)
    .filter(!prefs.hidden.includes(id))
    .concat(prefs.custom.map(toCandidate))
  score = roleAffinity + timeBonus + activationFit
  visible = resolvePinned(prefs.pinned).slice(0,6)             // user pins first
  fill remaining with CAPABILITY-DIVERSITY constraint until >=4 categories, then by score
  ensureActionSlot(visible)                                    // >=1 of DO/SEND/SCHEDULE/GET_PAID/AUTOMATE
  if activationScore<=1 OR counts all zero: use the "prove-it"/setup prompts (empty-data fallback)
  return visible.slice(0,6)
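
The pseudocode above can be sketched in TypeScript as follows. This is a minimal sketch under stated assumptions: the three score terms are collapsed into a single precomputed score field, custom prompts are omitted, Category is left as a plain string because only the action categories are named in this plan, and the two-pass fill plus last-slot swap is one possible reading of the diversity and action-slot rules.

```typescript
type Category = string; // only the action categories are enumerated in the plan

interface Prompt {
  id: string;
  label: string;
  prompt: string;
  category: Category;
  roles: string[];
  requiresFeature?: string;
  nonEmptyKey?: string;
  proveIt?: boolean;
  score?: number; // hypothetical stand-in for roleAffinity + timeBonus + activationFit
}

interface SelectCtx {
  role?: string;
  features: string[];
  counts: Record<string, number>;
  activationScore: number;
  prefs: { pinned: string[]; hidden: string[] };
}

const ACTION_CATEGORIES = new Set(['DO', 'SEND', 'SCHEDULE', 'GET_PAID', 'AUTOMATE']);
const MAX_VISIBLE = 6;
const MIN_CATEGORIES = 4;
const isAction = (p: Prompt): boolean => ACTION_CATEGORIES.has(p.category);

function selectStarters(catalog: Prompt[], ctx: SelectCtx): Prompt[] {
  const role = ctx.role ?? 'sales_rep';
  const emptyData =
    ctx.activationScore <= 1 || Object.values(ctx.counts).every((n) => n === 0);

  const eligible = catalog
    .filter((p) => p.roles.includes(role))
    .filter((p) => !p.requiresFeature || ctx.features.includes(p.requiresFeature))
    .filter((p) => !p.nonEmptyKey || (ctx.counts[p.nonEmptyKey] ?? 0) > 0) // never dead-end
    .filter((p) => !ctx.prefs.hidden.includes(p.id))
    .filter((p) => !emptyData || p.proveIt === true) // empty-data fallback
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));

  // User pins come first; ids no longer in the catalog are skipped silently.
  const byId = new Map<string, Prompt>(catalog.map((p): [string, Prompt] => [p.id, p]));
  const visible: Prompt[] = [];
  for (const id of ctx.prefs.pinned) {
    const p = byId.get(id);
    if (p && !visible.includes(p) && visible.length < MAX_VISIBLE) visible.push(p);
  }
  const pinnedCount = visible.length;

  const rest = eligible.filter((p) => !visible.includes(p));
  const categories = new Set(visible.map((p) => p.category));

  // Pass 1 (diversity): best-scoring prompt from each category not yet shown.
  for (const p of rest) {
    if (visible.length >= MAX_VISIBLE || categories.size >= MIN_CATEGORIES) break;
    if (!categories.has(p.category)) {
      visible.push(p);
      categories.add(p.category);
    }
  }
  // Pass 2: fill the remaining slots by score.
  for (const p of rest) {
    if (visible.length >= MAX_VISIBLE) break;
    if (!visible.includes(p)) visible.push(p);
  }
  // Action slot: if no action prompt made it in, swap out the last unpinned slot.
  if (!visible.some(isAction)) {
    const action = rest.find(isAction);
    if (action && pinnedCount < MAX_VISIBLE) visible[visible.length - 1] = action;
  }
  return visible.slice(0, MAX_VISIBLE);
}
```

Swapping the last slot for an action prompt cannot lower the category count, because the swap only happens when no action category is visible yet, so the incoming prompt always adds a new category.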

Default 6 visible (evidence: ~3–5 is the consensus; 6 matches the current grid and the owner's "top 6" — treat as the A/B-testable default, not a constant). Time-of-day from timezone (morning→plan-my-day; eod→log-today; friday→forecast).
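
Deriving the time-of-day bucket from the user's timezone can be done with the standard Intl.DateTimeFormat API. The sketch below is illustrative: the bucket names follow the mapping above, but the hour cutoffs (before 11:00 for morning, 16:00 onward for end of day) and the rule that Friday takes precedence over the hour are assumptions.

```typescript
type TimeBucket = 'morning' | 'eod' | 'friday' | 'default';

// Classify "now" in the user's IANA timezone (e.g. 'America/New_York').
function timeBucket(now: Date, timeZone: string): TimeBucket {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    weekday: 'short',
    hour: 'numeric',
    hour12: false,
  }).formatToParts(now);
  const weekday = parts.find((p) => p.type === 'weekday')?.value;
  // Some engines render midnight as "24" with hour12: false, hence the modulo.
  const hour = Number(parts.find((p) => p.type === 'hour')?.value) % 24;

  if (weekday === 'Fri') return 'friday'; // assumed to win over the hour buckets
  if (hour < 11) return 'morning';        // assumed cutoff
  if (hour >= 16) return 'eod';           // assumed cutoff
  return 'default';
}
```

Since the prompts are precomputed, the bucket would have to be evaluated for the time the user is expected to read them, not the time the cron runs; how that is resolved is left open here.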

1c. The capability-diversity rule (kills "it's just ChatGPT")

The visible set is forced to span ≥4 of 7 capability categories and always include ≥1 action category (DO/SEND/SCHEDULE/GET_PAID/AUTOMATE, the set enforced by ensureActionSlot in 1b), so the user *sees* it can act, not just chat.

The LLM layer's job is precisely to inject the real record names into llmUpgradable slots (the NN/g specificity lever).

1d. The catalog (`lib/ai/promptLibrary/catalog.ts`, typed, code-defined)

~40–55 entries, each {id,label,prompt,category,jobCategory,roles[],requiresFeature?,nonEmptyKey?,timeOfDay?,minActivation?,llmUpgradable?,proveIt?}. Labels outcome-phrased, plain language (evidence #1). Reuse the 6 current chips as sales_rep seeds.
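
A typed entry plus a well-formedness check (the "catalog well-formed" test listed under Files) might look like the sketch below. Field names come from the entry shape above; the field types for timeOfDay and minActivation, and the specific checks chosen, are assumptions.

```typescript
interface CatalogEntry {
  id: string;
  label: string;
  prompt: string;
  category: string;
  jobCategory: string;
  roles: string[];
  requiresFeature?: string;
  nonEmptyKey?: string;
  timeOfDay?: string;      // assumed type
  minActivation?: number;  // assumed type
  llmUpgradable?: boolean;
  proveIt?: boolean;
}

// Returns a list of human-readable problems; an empty list means the catalog passes.
function findCatalogProblems(catalog: CatalogEntry[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const entry of catalog) {
    if (seen.has(entry.id)) problems.push(`duplicate id: ${entry.id}`);
    seen.add(entry.id);
    if (entry.label.trim() === '') problems.push(`empty label: ${entry.id}`);
    if (entry.prompt.trim() === '') problems.push(`empty prompt: ${entry.id}`);
    if (entry.roles.length === 0) problems.push(`no roles: ${entry.id}`);
  }
  return problems;
}
```

Running this in a unit test over the real catalog would catch duplicate ids before they reach pinning, where ids are the persisted key.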

Pillar 2 — The prompt library drawer (in-chat "More")

A "Browse all prompts" affordance under the 6 → a slide-over PromptLibraryDrawer over the chat (no new route).
Grouped by job-to-be-done: Prospect & qualify · Advance deals · Close & win-back · Follow up & communicate · Get paid · Plan my day/week · Report up · Keep data clean · Put it on autopilot.
NEW — "For you" personalized section (owner ask): a top section driven by the same precomputed engine — prompts that name the user's own org/data (real deals/contacts/invoices), distinct from the generic catalog. This is the Breeze/Fin "grounded in your real data" pattern.
"Yours" section: pinned + user-authored custom prompts.
Each group header carries a one-line "what you'll get" (library doubles as discovery).

Pillar 3 — Pinning / "make it mine" (v1)

Follows the "Per-User Feature State on JSONB" convention. Add profiles.ai_prompt_prefs JSONB NOT NULL DEFAULT '{}':

{ "schema_version": 1,
  "pinned": ["<id>"], "hidden": ["<id>"],
  "custom": [{ "id":"<uuid>", "label":"…", "prompt":"…", "created_at":"…" }],
  "capability_tour": { "dismissed_at": null, "last_step": 0, "completed_at": null } }

(Putting tour state here, not in onboarding_state, keeps all AI-assistant per-user state in one column and avoids editing the onboarding_state hydrator.) Helpers lib/ai/promptPrefs.ts: hydratePromptPrefs, buildInitialPromptPrefs, resolveVisible. API app/api/ai/prompt-prefs/route.ts PATCH (withAuth): pin/reorder/hide, add/edit/delete custom, dismiss tour. Validation: pinned≤12, custom≤20, label≤60, prompt≤600, strip control chars, org+own-user scoped. On change → set ai_starter_prompts.next_refresh_at=now() so the next read reflects pins.
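
The hydrate and validate helpers can be sketched as below. The helper names hydratePromptPrefs and buildInitialPromptPrefs come from the plan; validatePromptPrefs, stripControlChars, the error strings, and the choice to strip newlines along with other control characters are assumptions for illustration.

```typescript
interface CustomPrompt {
  id: string;
  label: string;
  prompt: string;
  created_at: string;
}

interface PromptPrefs {
  schema_version: number;
  pinned: string[];
  hidden: string[];
  custom: CustomPrompt[];
  capability_tour: { dismissed_at: string | null; last_step: number; completed_at: string | null };
}

const LIMITS = { pinned: 12, custom: 20, label: 60, prompt: 600 };

function buildInitialPromptPrefs(): PromptPrefs {
  return {
    schema_version: 1,
    pinned: [],
    hidden: [],
    custom: [],
    capability_tour: { dismissed_at: null, last_step: 0, completed_at: null },
  };
}

// Fill in defaults so callers never see a partial object (the column defaults to '{}').
function hydratePromptPrefs(raw: unknown): PromptPrefs {
  const base = buildInitialPromptPrefs();
  if (typeof raw !== 'object' || raw === null) return base;
  const r = raw as Partial<PromptPrefs>;
  return {
    schema_version: typeof r.schema_version === 'number' ? r.schema_version : base.schema_version,
    pinned: Array.isArray(r.pinned) ? r.pinned : base.pinned,
    hidden: Array.isArray(r.hidden) ? r.hidden : base.hidden,
    custom: Array.isArray(r.custom) ? r.custom : base.custom,
    capability_tour: { ...base.capability_tour, ...(r.capability_tour ?? {}) },
  };
}

// Remove ASCII control characters (assumption: newlines are stripped too).
function stripControlChars(s: string): string {
  return s.replace(/[\u0000-\u001f\u007f]/g, '').trim();
}

// Returns an error message, or null when the prefs are within the caps.
function validatePromptPrefs(p: PromptPrefs): string | null {
  if (p.pinned.length > LIMITS.pinned) return 'too many pinned prompts';
  if (p.custom.length > LIMITS.custom) return 'too many custom prompts';
  for (const c of p.custom) {
    const label = stripControlChars(c.label);
    const prompt = stripControlChars(c.prompt);
    if (label.length === 0 || label.length > LIMITS.label) return 'invalid label';
    if (prompt.length === 0 || prompt.length > LIMITS.prompt) return 'invalid prompt';
  }
  return null;
}
```

In the PATCH route, a non-null result from validatePromptPrefs would map to the 400 response described under Edge cases.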

Pillar 4 — Interactive first-run capability tour (NET-NEW)

Goal: nudge users to discover the breadth of the assistant (evidence #4/#5), via a short, skippable, *interactive* demo on *manufactured* data — without polluting the org or costing LLM calls.

4a. Architecture — self-contained, deterministic, free

The chat hook has no scripted-turn API (useAIChat only has sendMessage, which always fires a live billed LLM call; verified). But the card components are purely presentational and can be driven with hand-built objects + callbacks at zero LLM cost: ArtifactCard + EmailDraftEditor (the email-draft demo), ApprovalCard, ActionResultCard ("Open Draft" chip), MarkdownRenderer, FollowupChips (clickable next-steps). So:

Build a CapabilityTour component that owns local step state and renders the real card UI with scripted Artifact/chip objects. It is deterministic, free, and looks identical to the real assistant because it *renders* the real card components.
Manufactured data without DB writes: use the existing sample-data fixture strings (lib/onboarding/sampleDataFixture.ts — Acme/Globex-style names) as the *display data* in the scripted cards. Nothing is written to the tenant. (Loading real sample rows stays the separate opt-in path.)
"Step N of M": reuse StepperDots (components/wizard/WizardShell.tsx:78).
Dismiss/resume: ai_prompt_prefs.capability_tour (cross-device; Pillar 3). A "?" affordance re-opens it.

4b. Step design (≤5, one per capability category — evidence #7)

A 5-step arc that shows breadth, each a *clickable* demo on sample data ending in a "try the next one" chip:

Know — "Here's your pipeline at a glance" (scripted summary card).
Send — "I can draft an email" → click → a real ArtifactCard/EmailDraftEditor renders a draft to a sample contact; user can open/review it.
Do — "I can create records" → click → an ApprovalCard shows a scripted create_task/create_opportunity preview.
Get paid / Schedule (role-branched) — "I can chase an invoice" or "I can book a meeting" → scripted card.
Automate — "I can do this on a schedule, for you" (agent template teaser) + a finale that hands off to the real /ai composer with a starter prompt prefilled (the user chooses to send it — the tour itself fires no live AI call, per the scripted-only decision).

Each step has a visible Skip and a "Step k of 5". Role-branched content where cheap (rep vs manager vs service).
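
The local step state the tour component owns can be modeled as a small reducer. This is a sketch under assumptions: the state and action names are hypothetical, Skip is treated as ending the tour (recorded as dismissed), and toCapabilityTour shows one way to map the state onto the capability_tour object from Pillar 3.

```typescript
const TOTAL_STEPS = 5;

type TourStatus = 'idle' | 'active' | 'dismissed' | 'completed';

interface TourState {
  status: TourStatus;
  step: number; // 0-based index of the current step
}

type TourAction =
  | { type: 'start'; fromStep?: number } // fromStep supports resume from last_step
  | { type: 'next' }
  | { type: 'skip' };

function tourReducer(state: TourState, action: TourAction): TourState {
  switch (action.type) {
    case 'start': {
      const from = Math.min(Math.max(action.fromStep ?? 0, 0), TOTAL_STEPS - 1);
      return { status: 'active', step: from };
    }
    case 'next':
      if (state.status !== 'active') return state;
      return state.step + 1 >= TOTAL_STEPS
        ? { status: 'completed', step: state.step }
        : { status: 'active', step: state.step + 1 };
    case 'skip':
      // Assumption: Skip exits the whole tour rather than advancing one step.
      return state.status === 'active' ? { status: 'dismissed', step: state.step } : state;
    default:
      return state;
  }
}

function stepLabel(state: TourState): string {
  return `Step ${state.step + 1} of ${TOTAL_STEPS}`;
}

// Map local state onto the persisted capability_tour shape.
function toCapabilityTour(state: TourState, nowIso: string) {
  return {
    dismissed_at: state.status === 'dismissed' ? nowIso : null,
    last_step: state.step,
    completed_at: state.status === 'completed' ? nowIso : null,
  };
}
```

The reducer never touches the network or the chat hook, which keeps the tour consistent with the scripted-only decision: every transition is a local state change plus, at most, one prefs PATCH.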

4c. Launch & placement (evidence #7 — never force)

Opt-in, dismissible card (default), not an auto-modal: on first /ai visit (and as a step in the onboarding wizard), show a card *"👋 New here? See the 5 things Laureo can do for you →"*. Clicking starts the tour; an X dismisses it forever (dismissed_at). This matches the owner's "a first-time prompt that once clicked gets dismissed."
Two placements (owner ask): (1) inside the onboarding wizard (reuse WizardShell/StepperDots); (2) the dismissible card on the /ai empty state (mount in app/ai/AIPageClient.tsx, gated on capability_tour.dismissed_at == null && completed_at == null).
Never dead-end: because the tour runs on sample-fixture strings, it always shows full, believable results even for a brand-new empty tenant.
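
The gating condition for the /ai empty-state card can be written as one predicate. The sketch below combines the dismissed_at/completed_at gate above with the messages.length>0 rule from Edge cases; the function name and the tolerance for a missing capability_tour object (the prefs column defaults to '{}') are assumptions.

```typescript
interface CapabilityTourPrefs {
  dismissed_at: string | null;
  last_step: number;
  completed_at: string | null;
}

// Show the first-run card only if the tour was never dismissed or completed
// and the conversation is still empty. Loose == null also covers undefined,
// which is what an un-hydrated '{}' prefs column would yield.
function shouldShowTourCard(
  tour: Partial<CapabilityTourPrefs> | undefined,
  messageCount: number,
): boolean {
  return tour?.dismissed_at == null && tour?.completed_at == null && messageCount === 0;
}
```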

Files

New: lib/ai/promptLibrary/{types,catalog,select}.ts · lib/ai/promptPrefs.ts · lib/ai/starterPrompts.ts (substrate + LLM refine) · app/api/cron/refresh-starter-prompts/route.ts · app/api/ai/prompt-prefs/route.ts · app/components/ai/PromptLibraryDrawer.tsx · app/components/ai/CapabilityTour/ (component + scripted step registry + sample-fixture demo data) · tests (select diversity/action-slot/gating/pinned/empty-fallback; promptPrefs hydrate; catalog well-formed; tour step registry; cron smoke).

Modified: app/components/ai/AIPanel.tsx (empty state reads precomputed set + pin icon + "Browse all" + first-run tour card) · app/ai/page.tsx + AIPageClient.tsx (prefetch ai_starter_prompts; thread role; mount tour card) · app/components/ai/DashboardAIPanel.tsx (inherit) · app/api/auth/callback/route.ts (session-start top-up in the existing after()) · scripts/setup-qstash-schedules.ts (register cron) · provisioning path (seed rules row) · lib/onboarding/ wizard (tour step) · lib/ai/chatFlags.ts + lib/ai/CLAUDE.md (doc-drift fix + document the engine).

Migrations (both additive/idempotent, auto-apply): <ts>_ai_starter_prompts.sql (the table) · <ts>_profiles_ai_prompt_prefs.sql (the prefs column).

Acceptance criteria

/ai empty state shows a role-appropriate, precomputed set read from the DB with zero LLM at load (verified: page reads ai_starter_prompts, no synchronous model call).
Visible set spans ≥4 capability categories incl. ≥1 action (unit-proven); llmUpgradable slots name real records when data exists.
No visible prompt dead-ends; empty/new tenant falls back to capability/setup prompts, never a blank or empty-result prompt.
Prompts are precomputed in the background (cron + provisioning seed + session-start top-up) and refreshed; LLM cost is bounded to active, paying-org users and routed through the billing chain; restricted orgs spend $0.
User can pin/reorder/hide/author prompts (persisted, cross-device); pins surface first; changes trigger a refresh.
The drawer opens over chat with job-category groups, a "For you" real-data section, and a "Yours" section.
The capability tour: opt-in dismissible card; ≤5 interactive steps on sample-fixture data (no DB writes, no LLM); visible Skip each step; dismiss/complete persists; available in onboarding AND on /ai first-run; never dead-ends.
tsc 0 · next build PPR-clean · vitest green incl. new tests · every query org-scoped + .limit(), no select('*'), count:'estimated' · cron uses verifyCronRequest + restricted-skip.
Doc-drift fixed; lib/ai/CLAUDE.md documents the precompute engine + tour.

Edge cases

Unknown role → sales_rep pool. New/empty org → capability/setup prompts + the tour still fully demos on sample strings. Collaborator/viewer seat → no LLM refine (rules only), tour still available. Redis/LLM/billing failure → rules set (or EMPTY_STATE_CHIPS). Pinned id removed from catalog → skip silently. Custom prompt over caps/control chars → 400. messages.length>0 → starters/tour card hidden. Cron partial failure → per-user next_refresh_at pushed, retried next run.

Phasing — single v1 ship (owner-locked: build all pillars together)

v1 = all four pillars in one ship. Pillar 1 (precompute engine: table + cron + rules/LLM + provisioning seed + session-start top-up; selection + diversity + non-empty + record-naming) · Pillar 2 (drawer incl. "For you" + "Yours") · Pillar 3 (pinning/custom) · Pillar 4 (interactive capability tour: tour component, scripted step registry, opt-in dismissible card, onboarding + /ai placements) · home-widget inheritance · tests · doc-drift fix.

Suggested internal build order (dependency ordering for /implement, NOT separate ships): migrations → precompute engine + selection algorithm → page/panel wiring + drawer → pinning/prefs → capability tour. File-disjoint where possible; all gated and shipped together.

Cost & scale

Pre-launch, tiny scale. Cron envelope: at most 50 LLM refreshes per run (.limit(50)), like writing-styles (~$0.05/run, micro tier); the rules layer is free for all users. No new env flag is required (openers already render unconditionally); an optional default-ON kill-switch in chatFlags.ts would revert to EMPTY_STATE_CHIPS.

Risks

Catalog & tour-script quality is the product — budget real copy/product effort on the ~40–55 prompts and the 5 tour steps (this is where the "wow" lives, not the plumbing).
The "0% click" trap — mitigated by record-specific phrasing, in-panel placement, plain language, do-what-it-says, never-dead-end (evidence #1).
LLM precompute cost — bounded to active/paying users, micro tier, restricted-skip, billing-gated; rules layer is the zero-cost floor.
Tour annoyance — opt-in + visible Skip + dismiss-forever (evidence #7).

Citations (proof)

NN/g Rufus "none clicked": https://www.nngroup.com/articles/discoverability-ai-amazon/
NN/g prompt-suggestion types: https://www.nngroup.com/articles/prompt-suggestions/
NN/g use-case prompt design (specificity/placement): https://www.nngroup.com/articles/designing-use-case-prompt-suggestions/
NN/g articulation barrier (<20%): https://www.nngroup.com/articles/ai-articulation-barrier/
CHI 2024 "gulf of envisioning": https://dl.acm.org/doi/full/10.1145/3613904.3642754
NBER GenAI adoption: https://www.nber.org/papers/w32966 · Pew: https://www.pewresearch.org/short-reads/2026/03/12/key-findings-about-how-americans-view-artificial-intelligence/
Tours: https://www.thinkific.com/blog/product-tour-best-practices/ · https://www.appcues.com/blog/product-tours-ui-patterns · https://userpilot.com/blog/interactive-walkthrough-vs-product-tour/
Analogs: HubSpot Breeze https://www.hubspot.com/products/artificial-intelligence/breeze-ai-assistant · Intercom Fin testing https://www.intercom.com/help/en/articles/10521711-test-fin-ai-agent · M365 Copilot prompt gallery https://learn.microsoft.com/en-us/microsoft-365/copilot/copilot-prompt-gallery
Code precedents: lib/home/aiToday.ts · ai_user_writing_styles (supabase/migrations/20260413140000_email_ai_overhaul.sql:118-160) · app/api/cron/refresh-writing-styles/route.ts · profiles.activation_score (20260521000000_onboarding_redesign_additive.sql) · app/api/auth/callback/route.ts after().