Enterprise procurement teams using Ikigai's forecasting models detect 3–7 critical supply disruption signals per month (weather events blocking port access, geopolitical sanctions on supplier regions, sudden demand spikes from retail channels) — but the model output is a probability distribution, not a decision. A procurement manager at a mid-market CPG company spends 2.8 hours per signal manually translating the model's forecast delta into a decision brief: interpreting confidence intervals, calculating inventory exposure, drafting procurement action options (expedite orders, shift suppliers, adjust safety stock), and building a risk scenario table for VP approval. At 5 signals/month × 2.8 hours × $72/hr blended procurement cost × 12 months × 38 Ikigai enterprise customers with procurement personas ≈ $460K/year in recoverable decision prep time (source: Customer Success time-tracking survey Q4 2024, n=12 customers; HR compensation data for procurement manager role; customer count from CRM). If only 50% of customers adopt this workflow: $230K/year. This excludes the downstream value of faster procurement decisions — every 24-hour delay in placing an expedited order during a disruption costs an average $18K in lost sales or expedite premiums (source: 2024 Gartner Supply Chain survey, n=340 enterprises).
This feature auto-generates a structured decision brief the moment Ikigai's forecasting model detects a material signal (demand spike >15% from baseline, supply delay >5 days, external disruption event matched in knowledge graph). The brief includes: impact estimate in revenue-at-risk and inventory exposure, 2–3 recommended procurement actions ranked by cost/speed trade-off, 3 risk scenarios (do nothing, partial mitigation, full mitigation) with cost and probability, model confidence score with explanation, and a one-click approval workflow that routes to the VP with full audit trail. The business case: 38 customers × 5 signals/month × 2.8 hrs × $72/hr × 12 months ≈ $460K/year recoverable time (source: assumptions above). Downside case at 50% adoption: $230K/year. The larger opportunity is contraction prevention: if this feature prevents 2 customers/year from churning because "the model outputs are too hard to action" (stated reason in 3 of 8 enterprise churn exit interviews, 2024), we retain $340K ARR (avg enterprise contract).
This feature is an auto-generated decision brief with procurement action recommendations, risk scenario modeling, and approval workflow — triggered by Ikigai's existing forecasting model signals. It is not an autonomous procurement execution system (no auto-PO creation, no supplier contract negotiation), not a replacement for ERP procurement workflows (recommendations route to human approval, then execution happens in SAP/Oracle/NetSuite), and not a general "explain any model output" tool (scoped to supply disruption signals only — demand forecasts, inventory optimization, and pricing models are out of scope for Phase 1).
NORTH STAR METRIC:
Brief approval rate: The percentage of auto-generated briefs that are approved (the user clicks "Approve", routing the brief to the VP or ERP) within 7 days of generation. This metric proves the feature is trusted and actionable.
TARGET: ≥60% approval rate at D90 (minimum viable trust threshold — if users reject >40% of briefs, the feature is not sufficiently useful to justify maintenance cost).
PRIMARY METRICS (PROVE THE PROBLEM IS SOLVED):
| Metric | Baseline | Target (D90) | Kill Threshold | Measurement Method | Owner |
|---|---|---|---|---|---|
| Brief approval rate | N/A (new feature) | ≥60% | <40% at D90 → full retrospective, consider pivot or kill | Mixpanel event tracking: brief_approved / brief_generated | PM (Sarah) |
| Time-to-decision (median) | 2.8 hours manual assembly (n=12, Q4 2024 survey) | ≤15 minutes | >45 minutes at D90 → feature is not materially faster, pause Phase 2 | Delta between brief_generated.timestamp and user_decision.timestamp | Eng (Priya) |
| User action rate | 61% of manual briefs result in action within 48 hrs (Q4 2024 survey) | ≥75% of auto-briefs result in action within 24 hrs | <65% at D90 → feature is not increasing decision velocity | user_decision.timestamp < 24 hrs after generation | PM (Sarah) |
| Calibration error (quarterly) | N/A (new feature) | <10 percentage points | >20pp for 2 consecutive quarters → disable confidence scores until recalibrated | abs(model_confidence - observed_stockout_rate) across all briefs with outcome data | Eng (Priya) |
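As a sketch of how the D90 checks above could be computed from the event log — assuming each brief record carries `generated_at`/`decided_at` timestamps and an `approved` flag derived from the Mixpanel events (field names are assumptions, thresholds are from the table):

```python
# Illustrative D90 metric check against the kill thresholds above.
from statistics import median

def d90_checks(briefs: list[dict]) -> dict:
    decided = [b for b in briefs if b.get("decided_at")]
    # Approval rate = brief_approved / brief_generated (per the table).
    approval_rate = (
        sum(1 for b in decided if b["approved"]) / len(briefs) if briefs else 0.0
    )
    minutes = [
        (b["decided_at"] - b["generated_at"]).total_seconds() / 60 for b in decided
    ]
    med = median(minutes) if minutes else None
    return {
        "approval_rate": approval_rate,                 # north star: target >=60%
        "kill_threshold_hit": approval_rate < 0.40,     # <40% at D90 -> retrospective
        "median_time_to_decision_min": med,             # target <=15 min
        "pause_phase_2": med is not None and med > 45,  # >45 min -> pause Phase 2
    }
```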
DECOMPOSED INPUT METRICS (WHAT DRIVES APPROVAL RATE):
| Input Metric | Hypothesis | Target | Measurement |
|---|---|---|---|
| Data completeness rate | Users trust briefs with complete data more than partial briefs | ≥90% of briefs have all required inputs (no data gaps) | data_gaps field in briefs table = empty |
| Recommendation relevance | Users approve briefs where ≥1 recommended action is feasible | ≥80% of briefs have ≥1 action rated "relevant" by user (post-approval survey) | User survey: "Were the recommended actions feasible? [Yes/Somewhat/No]" |
| Confidence clarity | Users understand what the confidence score means | ≥70% of users can correctly explain confidence score (tested via onboarding quiz) | Onboarding quiz: "What does an 85% confidence score mean? [multiple choice]" |
GUARDRAIL METRICS (MUST NOT DEGRADE):
| Guardrail | Threshold | Action if Breached |
|---|---|---|
| Overall forecast model accuracy | Must remain ≥90% (existing SLA) | If forecast accuracy drops below 90% for 2 consecutive weeks, pause brief generation and investigate model degradation |
| Customer churn rate (enterprise segment) | Must remain at or below the pre-launch baseline | If enterprise churn rises post-launch, review whether briefs contributed to churned accounts' decisions before expanding rollout |
OBJECTIVE: When Ikigai's forecasting model detects a material supply chain disruption signal (demand anomaly >15% from 30-day baseline, supplier delay >5 days from committed date, or external event matched in knowledge graph), generate a decision brief within 45 seconds that a procurement manager can review in <8 minutes and forward to VP approval — compared to the current 2.8-hour manual assembly process.
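The trigger condition reduces to a simple predicate — a minimal sketch, with thresholds taken from the objective above and parameter names assumed:

```python
from datetime import timedelta

def is_material_signal(demand_delta_pct: float,
                       supplier_delay: timedelta,
                       knowledge_graph_match: bool) -> bool:
    """True when any of the three trigger conditions for brief generation holds."""
    return (
        abs(demand_delta_pct) > 0.15              # demand anomaly >15% vs 30-day baseline
        or supplier_delay > timedelta(days=5)     # supplier delay >5 days past committed date
        or knowledge_graph_match                  # external event matched in knowledge graph
    )
```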
WHAT THE MODEL MUST DO:
The system integrates three inputs:
- The forecasting model's disruption signal (forecast delta from baseline plus the probability distribution).
- Customer supply chain data synced from the ERP and supplier master (current inventory levels, supplier lead times, supplier pricing).
- External disruption event data from the third-party API (weather, geopolitical).
The generated brief must include:
- An impact estimate in revenue-at-risk and inventory exposure.
- 2–3 recommended procurement actions ranked by cost/speed trade-off.
- 3 risk scenarios (do nothing, partial mitigation, full mitigation) with cost and probability.
- A model confidence score with explanation.
- A one-click approval workflow that routes to the VP with a full audit trail.
WHAT THE MODEL DOES NOT DO:
- No autonomous procurement execution: no auto-PO creation and no supplier contract negotiation.
- No replacement of ERP procurement workflows: recommendations route to human approval, and execution happens in SAP/Oracle/NetSuite.
- No general "explain any model output" capability: Phase 1 is scoped to supply disruption signals only (demand forecasts, inventory optimization, and pricing models are out of scope).
CALIBRATION REQUIREMENT:
The model confidence score must be empirically calibrated: if the model says "85% confidence this disruption will cause a stockout," then across all 85%-confidence predictions, we observe a stockout ~85% of the time. Calibration is measured quarterly using a hold-out set of disruption signals from the past 90 days. If calibration error >10 percentage points (e.g., 85% predictions result in stockouts only 70% of the time), we surface a warning banner in the UI: "Model confidence scores are currently under review — treat recommendations as directional."
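A sketch of the quarterly calibration check, assuming logged briefs expose a `model_confidence` in [0, 1] and a backfilled boolean `actual_outcome` (field names are assumptions):

```python
def calibration_error_pp(briefs: list[dict]) -> float:
    """Mean abs(model_confidence - observed_stockout_rate) in percentage points,
    bucketing briefs by stated confidence (nearest 5 points)."""
    buckets: dict[int, list[bool]] = {}
    for b in briefs:
        if b.get("actual_outcome") is None:
            continue                                     # outcome not yet backfilled
        bucket = round(b["model_confidence"] * 20) * 5   # e.g. 0.85 -> 85
        buckets.setdefault(bucket, []).append(b["actual_outcome"])
    errors = [
        abs(bucket - 100 * sum(outcomes) / len(outcomes))
        for bucket, outcomes in buckets.items()
    ]
    return sum(errors) / len(errors) if errors else 0.0
```

A result above 10 triggers the "under review" warning banner described above; above 20 for two consecutive quarters, confidence scores are disabled per the kill threshold.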
FALLBACK BEHAVIOR:
If any required input is missing (e.g., customer has not connected ERP inventory data, supplier lead time data is stale >30 days, external disruption event has no matched knowledge graph entry), the system generates a partial brief with a clear "Data gaps" section listing what's missing and what the user must manually verify. The brief is still generated (so the user knows a signal was detected), but the "Approve" button is disabled until the user acknowledges the data gaps and manually overrides.
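A sketch of the fallback path, assuming required inputs arrive as a dict keyed by source name (the brief shape and names below are illustrative):

```python
REQUIRED_INPUTS = ("forecast", "inventory", "supplier_lead_times")

def build_brief(signal: dict, inputs: dict) -> dict:
    """Always produce a brief; disable approval when required inputs are missing."""
    data_gaps = [name for name in REQUIRED_INPUTS if inputs.get(name) is None]
    return {
        "signal_id": signal["id"],
        "data_gaps": data_gaps,                     # rendered as the "Data gaps" section
        "approve_enabled": not data_gaps,           # Approve stays disabled until...
        "requires_override_ack": bool(data_gaps),   # ...the user acknowledges the gaps
    }
```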
TRAINING DATA:
This feature does not train a new ML model. It uses Ikigai's existing forecasting model (already trained on customer historical demand/supply data) and adds a deterministic post-processing layer (impact calculation + action ranking). The only "learning" component is calibration: we log every generated brief, the user's decision (approve/reject/defer), and the actual outcome (did a stockout occur? was revenue impacted?). This outcome data is stored for quarterly calibration checks and future Phase 2 ML ranking.
REQUIRED DATA INPUTS (per customer):
| Input | Source | Update Frequency | Quality Requirement |
|---|---|---|---|
| Demand forecast (with confidence interval) | Ikigai forecasting model | Real-time (every model run) | Must include delta from baseline + probability distribution |
| Current inventory levels (by SKU) | Customer ERP (SAP, Oracle, NetSuite) | Daily batch sync | Must be <24 hrs stale; if >48 hrs stale, surface warning |
| Supplier lead times (by supplier + SKU) | Customer supplier master (uploaded to Ikigai or synced from ERP) | Weekly batch sync | Must include historical avg + current committed lead time; if no data, use industry default (14 days) with warning |
| Supplier pricing (base + expedite premium) | Customer supplier master | Monthly batch sync | If missing, use placeholder "$[expedite cost not configured]" and disable cost ranking |
| External disruption events (weather, geopolitical) | Third-party API (e.g., Everstream Analytics, Resilinc) | Real-time webhook | If API unavailable, degrade to "external event detected (details unavailable)" |
| Historical procurement decisions + outcomes | Ikigai audit log (new table: procurement_decisions) | Real-time (logged at approval) | Captures: brief ID, user decision, approval timestamp, actual stockout (boolean, backfilled after 30 days) |
DATA GAPS & MITIGATION:
The largest gap is outcome data: we only learn whether a stockout actually occurred if the user reports it. Mitigation: a 30-day follow-up prompt asks the user to report the outcome, which is backfilled into procurement_decisions.actual_outcome. Target 60% response rate (based on analogous feedback surveys in other Ikigai features).
SYNTHETIC DATA FOR TESTING:
We will generate 50 synthetic disruption scenarios (10 scenarios × 5 risk levels) using anonymized historical data from 3 design partner customers. Each scenario includes: baseline forecast, disrupted forecast, inventory snapshot, supplier lead times, and expected brief output. Engineering uses this dataset for regression testing (every code change must produce identical briefs for the 50 scenarios). Design partners review the briefs for realism during Beta.
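A sketch of the regression harness, assuming each scenario is stored as a JSON file holding the inputs and the expected brief output (the file layout, paths, and generate_brief helper are assumptions):

```python
import json
import pathlib
import pytest

from brief_generator import generate_brief  # hypothetical module under test

SCENARIO_DIR = pathlib.Path("tests/synthetic_scenarios")  # assumed location

@pytest.mark.parametrize(
    "path", sorted(SCENARIO_DIR.glob("*.json")), ids=lambda p: p.stem
)
def test_brief_matches_expected(path):
    scenario = json.loads(path.read_text())
    brief = generate_brief(scenario["signal"], scenario["inputs"])
    # Every code change must reproduce the expected brief for all 50 scenarios.
    assert brief == scenario["expected_brief"]
```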
EVALUATION OBJECTIVE:
Prove that the auto-generated brief is (a) factually correct (numbers match source data), (b) actionable (procurement manager can make a decision in <8 minutes), and (c) trusted (user approves the recommendation ≥60% of the time, or provides a reason for rejection that informs Phase 2 improvements).
PHASE 1 EVALUATION (PRE-LAUNCH):
Test 1 — Factual Correctness (Offline Eval)
Verify that every number in the generated brief matches its source data, using the 50 synthetic scenarios as the evaluation set.
Test 2 — Actionability (Design Partner Beta, n=3 customers)
Design partners review generated briefs during Beta; success means a procurement manager can review a brief and reach a decision in <8 minutes.
Test 3 — Trust Calibration (Design Partner Beta)
Design partners rate whether the stated confidence score matches their own assessment of each disruption; rejection reasons are captured to inform Phase 2.
PHASE 2 EVALUATION (POST-LAUNCH, ONGOING):
Metric 1 — Approval Rate (Primary Success Metric)
Measured via Mixpanel event tracking (brief_generated, brief_approved, brief_rejected, brief_deferred).
Metric 2 — Time-to-Decision (Efficiency Metric)
Measured as the delta between brief_generated.timestamp and user_decision.timestamp in the database.
Metric 3 — Calibration Error (Trust Metric)
Metric 4 — Override Pattern Analysis (Phase 2 Input)
Rejection reasons and override patterns are reviewed monthly to identify systematic gaps in recommendations; findings feed the Phase 2 ML ranking backlog (see Training Data).
WHAT WE ARE NOT MEASURING (AND WHY):
HUMAN OVERSIGHT ARCHITECTURE:
This feature is a recommendation system with mandatory human approval — no procurement action is executed without explicit user approval. The human is always in the loop for three critical decision points:
Decision Point 1 — Brief Review & Approval (Procurement Manager)
Every decision is logged (brief_id, user_id, action, timestamp, comment). The audit log is exportable for compliance reviews.
Decision Point 2 — VP Approval (Optional, Customer-Configurable)
Customers can require a second approval step: the approved brief routes to the VP with the full audit trail before any action moves to the ERP. If this step is disabled, the procurement manager's approval is final.
Decision Point 3 — Outcome Feedback (30-Day Follow-Up)
Thirty days after a decision, the user is prompted to report the actual outcome (did a stockout occur? was revenue impacted?). The response is backfilled into procurement_decisions.actual_outcome and feeds the quarterly calibration check.
ESCALATION PATH (MODEL FAILURE MODE):
If the brief generation fails (e.g., ERP integration timeout, external API unavailable, model output is NaN), the system does NOT silently fail. Instead:
- The user is notified that a signal was detected but a brief could not be generated, so the signal is never lost.
- The failure is logged and alerted to engineering for investigation; brief generation is retried once the failing dependency recovers.
OVERRIDE TRANSPARENCY:
If a user consistently rejects recommendations (e.g., user rejects ≥70% of briefs over 30 days), the system surfaces a banner: "We've noticed you're rejecting most recommendations. Would you like to adjust your risk tolerance settings or provide feedback to improve future briefs?" This prevents silent model-user misalignment.
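A minimal sketch of that check, assuming the user's trailing-30-day decisions are available as strings (names assumed):

```python
def show_misalignment_banner(decisions: list[str]) -> bool:
    """True when the user rejected >=70% of briefs over the trailing 30 days."""
    if not decisions:
        return False
    return decisions.count("rejected") / len(decisions) >= 0.70
```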
OBJECTIVE: Ensure the auto-generated brief is transparent, auditable, and safe — users trust the recommendation enough to act on it, and the company can defend the recommendation in a post-incident review if a procurement decision goes wrong.
GUARDRAIL 1 — CONFIDENCE SCORE TRANSPARENCY
Every brief includes a confidence explanation (not just a number):
- What the score means in plain language, consistent with the calibration requirement (e.g., "85% confidence" means roughly 85% of predictions at this level have historically resulted in a stockout).
- What drives the score for this brief (signal strength, input data freshness, and historical precedent for similar disruptions).
GUARDRAIL 2 — DATA PROVENANCE (SHOW YOUR WORK)
Every number in the brief must cite its source:
- Backend stores a provenance object for every calculated field (source table, timestamp, calculation formula).
- Frontend renders this in an expandable "Show calculation" accordion under each number.
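The provenance object might look like the following — an illustrative shape; only the source table, timestamp, and formula are required by the guardrail, the other keys and all values are assumptions:

```python
provenance = {
    "field": "revenue_at_risk",                # which brief number this backs
    "source_table": "erp_inventory_daily",     # assumed table name
    "source_timestamp": "2025-01-14T06:00:00Z",
    "formula": "units_short * avg_selling_price",
    "inputs": {"units_short": 1200, "avg_selling_price": 41.50},  # illustrative values
}
```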
GUARDRAIL 3 — DATA STALENESS WARNINGS
If any input data is stale (>24 hrs for inventory, >7 days for supplier lead times, >30 days for pricing), the brief surfaces a warning banner identifying the stale input and its last successful sync time.
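A sketch of the staleness check using the thresholds above (the record shape is an assumption):

```python
from datetime import datetime, timedelta, timezone

STALENESS_LIMITS = {
    "inventory": timedelta(hours=24),
    "supplier_lead_times": timedelta(days=7),
    "supplier_pricing": timedelta(days=30),
}

def staleness_warnings(last_synced: dict[str, datetime]) -> list[str]:
    """One warning string per input whose last sync exceeds its limit."""
    now = datetime.now(timezone.utc)
    return [
        f"{source} last synced {now - synced_at} ago (limit: {limit})"
        for source, limit in STALENESS_LIMITS.items()
        if (synced_at := last_synced.get(source)) and now - synced_at > limit
    ]
```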
GUARDRAIL 4 — ASSUMPTION TRANSPARENCY
Every brief includes an "Assumptions" section listing what the model assumes to be true:
- The demand baseline is the trailing 30-day average.
- Supplier lead times reflect the most recent sync; where supplier data is missing, the 14-day industry default is applied and flagged.
- No disruption events other than those matched in the knowledge graph are in effect.
GUARDRAIL 5 — AUDIT TRAIL (EVERY DECISION IS LOGGED)
Every brief generated, every user decision (approve/reject/defer), and every outcome (stockout occurred or not) is logged in an immutable audit table:
procurement_decisions table with columns: brief_id, user_id, generated_at, decision (approved/rejected/deferred), decided_at, vp_approved_at (if applicable), rejection_reason (free text), actual_outcome (boolean: stockout occurred), outcome_reported_at
GUARDRAIL 6 — RATE LIMITING (PREVENT ALERT FATIGUE)
If the model generates >10 disruption briefs for a single customer in a 7-day window, the system surfaces a banner: "High disruption signal volume detected. Review model sensitivity settings or contact support." This prevents two failure modes: (a) model is miscalibrated and firing false positives, (b) user ignores briefs due to alert fatigue.
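A sketch of the volume check (how per-customer brief timestamps are stored is an assumption):

```python
from datetime import datetime, timedelta, timezone

def high_signal_volume(brief_timestamps: list[datetime], limit: int = 10) -> bool:
    """True when more than `limit` briefs were generated in the trailing 7 days."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=7)
    return sum(1 for ts in brief_timestamps if ts >= cutoff) > limit
```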
INFERENCE TRIGGER:
Brief generation is triggered by Ikigai's existing forecasting model detecting a material signal. The forecasting model runs on a scheduled cadence (customer-configurable: hourly, daily, or on-demand). When the model detects:
- a demand anomaly >15% from the 30-day baseline,
- a supplier delay >5 days from the committed date, or
- an external disruption event matched in the knowledge graph,
...the model publishes a message to a Kafka topic (forecast_disruption_signals). The brief generator service subscribes to this topic and processes each signal asynchronously.
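A sketch of the subscriber loop, assuming the kafka-python client and a hypothetical generate_brief handler; the topic name is from the spec, while the broker address and group id are assumptions:

```python
import json
from kafka import KafkaConsumer  # kafka-python client (assumed choice)

consumer = KafkaConsumer(
    "forecast_disruption_signals",
    bootstrap_servers=["kafka:9092"],        # assumed broker address
    group_id="brief-generator",              # consumer group enables horizontal scaling
    value_deserializer=lambda v: json.loads(v),
    enable_auto_commit=False,                # commit only after the brief is persisted
)

for message in consumer:
    signal = message.value
    try:
        generate_brief(signal)               # hypothetical handler: builds + stores the brief
        consumer.commit()                    # at-least-once processing semantics
    except Exception:
        notify_failure(signal)               # hypothetical: escalation path, never silent
```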
EXPECTED LOAD:
LATENCY REQUIREMENT:
Brief must be generated within 45 seconds of signal detection (p95 latency). Breakdown:
SCALING STRATEGY:
CACHE STRATEGY (REDUCE LATENCY):
FAILURE MODE MITIGATION: