Knowledge workers today lose institutional reasoning capital after critical decisions. When using structured frameworks like Paul-Elder, teams invest hours in rigorous analysis – challenging assumptions, weighing evidence, and mapping logic – only to have insights trapped in transient whiteboards or fading memory. This forces redundant rework during audits, creates version control nightmares for stakeholders, and obscures decision quality patterns. A 2024 Deloitte study found decision documentation consumes 31% of strategic meeting time (n=420 teams), yet 68% of critical premises remain unrecorded (source: "Cognitive Debt in Enterprises," Deloitte, Feb 2024).
This feature captures reasoning sessions as auditable artifacts. Business case: 8,000 active users (source: H1 2025 internal analytics) × 2.5 high-stakes decisions/user/month (source: user survey, n=1,200, May 2025) × $120 value/recovered decision-hour (source: blended knowledge worker cost + error reduction, McKinsey 2024 benchmarks) × 12 months = $28.8M/year value at full adoption, assuming one recovered hour per decision. If adoption reaches 40%: $11.5M/year. Implementation cost: $310K (source: regional cost benchmarks - India engineering team).
This is an immutable reasoning ledger for critical decisions. This is not real-time collaboration, automated conclusion generation, or regulatory compliance documentation.
| Competitor | How They Solve This Job Today |
|---|---|
| Notion | Manual template documentation requiring discipline to maintain fidelity |
| Miro | Ephemeral whiteboard exports losing contextual annotations |
| Guru | Static wikis detached from live reasoning sessions |
| Capability | Notion | Miro | Houses of Thought |
|---|---|---|---|
| Structured framework-guided capture | ❌ | ❌ | ✅ (unique) |
| Auto-generated reasoning summaries | ❌ | ❌ | ✅ |
| Comparative decision quality scoring | ❌ | ❌ | ✅ |
| WHERE WE LOSE | Ecosystem integration | Visual flexibility | ❌ vs ✅ |
Our wedge is framework-native provenance because competitors retrofit documentation onto unstructured workflows.
| Metric | Measured Baseline |
|---|---|
| Decision rework due to lost context | 3.1 hrs/decision (n=47 incident reviews) |
| Stakeholder clarification requests | 8.2/week per team (source: Slack analytics) |
Value recoverable: 8,000 users × 3.1 hrs × $120/hr × 12 months = $35.7M/year.
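The recoverable-value figure can be reproduced with a few lines of arithmetic. The sketch below assumes, as the formula implies but the text does not state, that each user loses context on one decision per month; the function and parameter names are illustrative only.

```python
def annual_recoverable_value(users: int, rework_hours: float,
                             hourly_rate: float, months: int = 12) -> float:
    """Annualized cost of rework hours.

    Assumption: one affected decision per user per month.
    """
    return users * rework_hours * hourly_rate * months


value = annual_recoverable_value(users=8_000, rework_hours=3.1, hourly_rate=120)
print(f"${value / 1e6:.1f}M/year")
```

Keeping the inputs as named parameters makes it easy to rerun the estimate when the baseline or the user count changes.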
Primary Metrics
| Metric | Baseline | Target | Kill Threshold | Measurement |
|---|---|---|---|---|
| Decision rework time | 3.1 hrs | ≤1.1 hrs | >2.0 hrs (D90) | Time-tracking |
| Audit trail adoption | 0% | 65% | <30% (D60) | Feature telemetry |
| Stakeholder trust score | 3.1/5 | 4.3/5 | <3.5 (D120) | CSAT survey |
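One way to make the kill thresholds operational is to encode each row as a rule and evaluate it once its checkpoint day has been reached. This is a minimal sketch: the metric keys, the `KillRule` type, and the choice to evaluate a rule at or after its checkpoint day are assumptions, not part of the plan above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class KillRule:
    metric: str                         # hypothetical metric key
    checkpoint_day: int                 # D60 / D90 / D120 from the table
    breached: Callable[[float], bool]   # True when the kill threshold is crossed


RULES = [
    KillRule("decision_rework_hours", 90, lambda v: v > 2.0),
    KillRule("audit_trail_adoption", 60, lambda v: v < 0.30),
    KillRule("stakeholder_trust_score", 120, lambda v: v < 3.5),
]


def breached_rules(day: int, observed: dict[str, float]) -> list[str]:
    """Return metrics whose kill threshold is crossed at or after their checkpoint."""
    return [
        rule.metric
        for rule in RULES
        if day >= rule.checkpoint_day
        and rule.metric in observed
        and rule.breached(observed[rule.metric])
    ]


# At D90 the trust score is not yet evaluated (its checkpoint is D120).
d90 = breached_rules(90, {
    "decision_rework_hours": 2.4,
    "audit_trail_adoption": 0.41,
    "stakeholder_trust_score": 3.2,
})
print(d90)
```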
Guardrail Metrics
| Metric | Threshold | Action |
|---|---|---|
| False attribution rate | >1% | Freeze AI extraction |
| P99 load time | >3.4s | Scale graph DB cluster |
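The guardrails can be checked mechanically from raw samples. The sketch below uses the nearest-rank definition of P99, which is one of several common percentile conventions; the function names and input shapes are assumptions.

```python
def p99(samples_s: list[float]) -> float:
    """Nearest-rank 99th percentile of load-time samples, in seconds."""
    if not samples_s:
        raise ValueError("no samples")
    ordered = sorted(samples_s)
    rank = (99 * len(ordered) + 99) // 100   # ceil(0.99 * n) in integer math
    return ordered[rank - 1]


def guardrail_actions(load_times_s: list[float],
                      false_attributions: int,
                      total_attributions: int) -> list[str]:
    """Return the actions from the guardrail table whose thresholds are exceeded."""
    actions = []
    if total_attributions and false_attributions / total_attributions > 0.01:
        actions.append("Freeze AI extraction")
    if p99(load_times_s) > 3.4:
        actions.append("Scale graph DB cluster")
    return actions
```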
What We Are NOT Measuring
- Total sessions captured (vanity - doesn't indicate value)
- Auto-summary word count (misleading - concision ≠ quality)
- Raw feature usage (fails to capture intentional adoption)
Risk 1 - Summary Distortion
- Probability: Medium | Impact: High
- Mitigation: Human-in-loop verification gate (owned by UX lead Priya - implemented Phase 1)
- Kill Criteria: >2% of summaries omit critical premises in D90 audit
Risk 2 - Compliance Gap
- Probability: Low | Impact: Critical
- Mitigation: GDPR Article 22 review by Legal (Anika) by 2025-10-15
- Contingency: If not cleared, disable EU data processing until resolution
Risk 3 - Adoption Friction
- Probability: High | Impact: Medium
- Mitigation: 1-click export to Notion/Confluence (owned by Dev Raj - target Phase 1.1)
- Kill Criteria: <30% of power users adopt by D60
Risk 4 - Vendor Lock-in
- Probability: Medium | Impact: Medium
- Mitigation: W3C Verifiable Credentials export (owned by CTO - Phase 1.2)
Pre-Mortem
It is 6 months from now and this feature failed because:
- Verification friction added 8min/session - users reverted to screenshots
- Legal blocked EU rollout due to "right to explanation" conflicts
- Competitor (Mural) launched AI whiteboard replay before Phase 1.1 integrations
Success looks like: Product leads reference past decisions during planning, auditors sample trails not raw data, and the CEO says "This finally makes our reasoning debt visible."
- Reasoning Graph Extraction: Convert free-form dialog into structured nodes (assumptions/evidence/gaps) with 95% entity recognition accuracy
- Argument Provenance: Link all claims to original session artifacts with cryptographic hashing
- Summary Fidelity: Generate executive briefs retaining 100% of critical premises (P0 - launch blocking)
- Anomaly Detection: Flag unresolved logical gaps with 99% precision (P1)
Rejected alternative: LLM-generated synthetic reasoning paths. Rationale: Violates core trust principle - audit trails must reflect actual human cognition.
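To illustrate how the reasoning graph and argument provenance requirements might fit together, here is a minimal sketch of a node that records the hash of the artifact it was extracted from. SHA-256 is an assumption (the requirement only says "cryptographic hashing"), and every type and function name is hypothetical.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    ASSUMPTION = "assumption"
    EVIDENCE = "evidence"
    GAP = "gap"


@dataclass(frozen=True)
class Artifact:
    """A raw session artifact, e.g. a transcript segment or an exported file."""
    artifact_id: str
    content: bytes

    @property
    def digest(self) -> str:
        return hashlib.sha256(self.content).hexdigest()


@dataclass(frozen=True)
class ReasoningNode:
    kind: NodeKind
    text: str
    artifact_id: str
    artifact_digest: str   # hash of the artifact as it was at capture time


def link(kind: NodeKind, text: str, artifact: Artifact) -> ReasoningNode:
    """Create a node that points back to its source artifact."""
    return ReasoningNode(kind, text, artifact.artifact_id, artifact.digest)


def provenance_intact(node: ReasoningNode, artifact: Artifact) -> bool:
    """True if the artifact still matches what the node was linked to."""
    return (node.artifact_id == artifact.artifact_id
            and node.artifact_digest == artifact.digest)
```

Because the node stores the digest rather than the content, a later change to the artifact is detectable without keeping a second copy of it.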
Sources
- Voice transcripts (PII-redacted)
- Framework-specific annotations (e.g., "Assumption:" tags)
- User-highlighted evidence snippets
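Framework-specific annotations lend themselves to simple rule-based extraction. The following sketch pulls tagged utterances out of a transcript; only the "Assumption:" tag appears in the list above, so the "Evidence:" and "Gap:" tags, along with the function name, are assumptions.

```python
import re

# Matches lines such as "Assumption: pricing is not the main driver".
TAG_PATTERN = re.compile(r"^\s*(Assumption|Evidence|Gap)\s*:\s*(.+?)\s*$",
                         re.IGNORECASE)


def extract_tagged(utterances: list[str]) -> list[tuple[str, str]]:
    """Return (kind, text) pairs for utterances that start with a framework tag."""
    results = []
    for line in utterances:
        match = TAG_PATTERN.match(line)
        if match:
            results.append((match.group(1).lower(), match.group(2)))
    return results


tagged = extract_tagged([
    "Assumption: pricing is not the main churn driver",
    "let's break for lunch",
    "evidence: Zendesk export #45 shows latency over 72h  ",
])
print(tagged)
```

Untagged utterances are dropped, so only framework-relevant lines reach later stages.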
Storage
graph LR
A[Session Raw Data] -->|Immutable Write| B[IPFS CID]
B --> C[Structured Graph DB]
C --> D[Audit API]
D --> E[Client UI]
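The immutable-write step in the diagram relies on content addressing: the address of a blob is derived from its bytes, so changed content cannot reuse an existing address. The in-memory class below is a stand-in that illustrates the idea and does not use an IPFS client. Note that a real IPFS CID is a multihash-based identifier, not the bare SHA-256 hex digest used here.

```python
import hashlib


class WriteOnceStore:
    """In-memory stand-in for content-addressed, write-once storage."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data under the hash of its content and return that address."""
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so a repeat write is a no-op.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        """Fetch data and verify it still matches its address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data
```

Downstream stages (graph DB, audit API) would then hold only the returned address, which is what makes later tamper checks possible.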
Critical Constraints
- Data minimization: Store only framework-relevant utterances
- GDPR Article 17 (right to erasure) compliance: Full session deletability within 72hrs
P0 Test Suite (100% pass required)
| Test Class | Method | Target |
|---|---|---|
| Premise Integrity | Compare 50 human-labeled vs auto-captured sessions | 0% critical premise omission |
| Tamper Evidence | Inject edited session → verify audit trail mismatch | 100% detection rate |
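The premise integrity test reduces to a set comparison per session. This sketch assumes premises have already been normalized to comparable identifiers, so exact matching is meaningful; in practice, aligning human labels with auto-captured text would need a matching step that is not shown here. Function names are illustrative.

```python
def omitted_premises(human_labeled: set[str], auto_captured: set[str]) -> set[str]:
    """Critical premises a human labeled that the auto-capture missed."""
    return human_labeled - auto_captured


def premise_integrity_passes(sessions: list[tuple[set[str], set[str]]]) -> bool:
    """P0 gate: passes only if no session omits any critical premise.

    Each item is a (human_labeled, auto_captured) pair of premise identifiers.
    """
    return all(not omitted_premises(human, auto) for human, auto in sessions)
```

Extra auto-captured premises do not fail this gate; only omissions do, which matches the 0% omission target.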
P1 Continuous Monitoring
- Logical gap detection F1-score: ≥0.92 (measured weekly on 100 sampled decisions)
- Evidence misattribution rate: <0.5% (Pareto root-cause analysis)
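For the weekly sample, the F1-score follows directly from the counts of true positives, false positives, and false negatives among flagged gaps. A minimal sketch, with a hypothetical function name and made-up example counts:

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        return 0.0
    return 2 * true_pos / denominator


# Illustrative counts only: 46 gaps correctly flagged, 3 false alarms, 5 missed.
weekly_f1 = f1_score(true_pos=46, false_pos=3, false_neg=5)
print(f"F1 = {weekly_f1:.2f}, meets target: {weekly_f1 >= 0.92}")
```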
Failure Protocol
If premise integrity falls below 100% in staging:
- Freeze model deployment
- Revert to rule-based capture
- Notify PM/legal within 1hr
┌─────────────────────────── Decision Audit Console ───────────────────────────┐
│ [Problem] Reduce customer churn in EMEA region │
├──────────────────────────────────────────────────────────────────────────────┤
│ Premise │ Source │ Status │ Verify │
│──────────────────────┼────────────────────┼──────────────┼───────────────────┤
│ Pricing not primary │ Session 00:12:45 │ ✅ Verified │ [Replay Context] │
│ churn driver │ │ │ │
│──────────────────────┼────────────────────┼──────────────┼───────────────────┤
│ Support latency >72h │ Zendesk export #45 │ ⚠ Unverified │ [Mark Incorrect] │
├──────────────────────────────────────────────────────────────────────────────┤
│ [GAP] Churn survey data not reconciled with CRM trends │
└──────────────────────────────────────────────────────────────────────────────┘
- Four-Eyes Principle
  - All high-impact decisions (defined by $ value/regulatory scope) require: 👤 Decision owner validation + 👥 Framework moderator countersignature
- Temporal Provenance
  - All session elements stored with NTP-synced timestamps (±50ms)
- Inviolability Controls
  - Write-once storage with Merkle root verification
  - Session edits append new version with diff highlighting
- Access Governance
  - RBAC with decision-specific permissions (e.g., "view evidence but not assumptions")
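Merkle root verification can be sketched as follows: hash every session element, then hash pairs upward until a single root remains. The details here (SHA-256, domain-separation prefixes, duplicating the last node on odd levels) are assumptions chosen for brevity; production schemes differ in how they handle odd levels.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    """Compute a Merkle root over session elements, returned as hex."""
    if not leaves:
        raise ValueError("at least one leaf required")
    # Prefix bytes keep leaf hashes and internal-node hashes distinct.
    level = [_sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])   # duplicate the last node on odd levels
        level = [_sha256(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()


root = merkle_root([b"premise-1", b"premise-2", b"evidence-1"])
# Any edit to an element yields a different root.
edited = merkle_root([b"premise-1", b"premise-2 (edited)", b"evidence-1"])
print(root != edited)
```

Appending a new version as an extra leaf also changes the root, so each version of a session gets its own verifiable fingerprint.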
Compromise Protocol
If cryptographic chain breaks:
- Auto-lock affected sessions
- Notify security team + impacted users within 15min
- Forensic audit within 24hrs
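Detecting a broken chain and choosing which sessions to auto-lock could look like the sketch below. It assumes a simple linear hash chain in which each entry commits to the previous entry's hash, and it locks every session from the first broken link onward; the entry layout and all names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

GENESIS = "0" * 64   # placeholder previous-hash for the first entry


@dataclass(frozen=True)
class Entry:
    session_id: str
    payload: bytes
    prev_hash: str
    entry_hash: str


def compute_hash(session_id: str, payload: bytes, prev_hash: str) -> str:
    hasher = hashlib.sha256()
    # A separator byte keeps the fields from running together.
    hasher.update(prev_hash.encode() + b"\x00" + session_id.encode() + b"\x00")
    hasher.update(payload)
    return hasher.hexdigest()


def append(chain: list[Entry], session_id: str, payload: bytes) -> None:
    prev = chain[-1].entry_hash if chain else GENESIS
    chain.append(Entry(session_id, payload, prev,
                       compute_hash(session_id, payload, prev)))


def sessions_to_lock(chain: list[Entry]) -> list[str]:
    """Session ids from the first broken link onward; empty if the chain is intact."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        expected = compute_hash(entry.session_id, entry.payload, entry.prev_hash)
        if entry.prev_hash != prev or entry.entry_hash != expected:
            return [later.session_id for later in chain[index:]]
        prev = entry.entry_hash
    return []
```

Entries before the break remain verifiable, so only the affected tail needs to be locked and handed to the forensic audit.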