Wireframe In Progress

Validate · Hypothesis Detail

Round 1 · Created 2026-05-26 · Section: Validate · Product: quanta
Anchor: H-1 · AI Intake Agent
Showcase: Validate journey-phase spine


H-1 · AI Intake Agent · Mission Engineering

IO · AI · brownfield · Acquisitio Defense · gate Thu (2 days)

Lead · CFO · CTO · CEO · CCO
Value hypothesis · hire statement

A Mission Engineering technician at Acquisitio hires this to triage inbound capability requests in <2 hours instead of the current 14-day queue, because the legacy ticket-routing system buries requests behind manual classification — and the queue depth is rising as DoD program tempo accelerates.

Engagement constitution
Cost AVOIDANCE only, no displacement · CTR target ≥ 0.40 · AI-risk gate must clear 80% · Transfer-Posture verified before Scale · ITAR-aware throughout
⬤ Persevere met ✓
Accuracy ≥ 85% AND latency < 2h on held-out eval
current 88%
threshold mark at 85% · +3 pts above
committed 2026-05-03 · before MVP ran
◐ Pivot not triggered
Accuracy 60–85% OR HITL load > 40% — pivot on bucket re-diagnosis
HITL load · current 8%
HITL threshold at 40% · far below
committed 2026-05-03 · before MVP ran
○ Kill not triggered
Accuracy < 60% OR domain-sparsity probe fails — out-of-regime
accuracy floor 60%
current 88% > floor · cleared
committed 2026-05-03 · before MVP ran
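
The three pre-committed outcomes above can be read as one decision rule. The following is a minimal sketch of that rule, not the product's implementation: the `EvalResult` type, its field names, the kill-then-pivot-then-persevere check order, and the `"undetermined"` fallback are all assumptions made for illustration. The page does not say what happens when accuracy clears 85% but latency does not, so the sketch reports that case separately instead of guessing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical container for one held-out eval run."""
    accuracy: float          # fraction correct, 0..1
    latency_hours: float     # triage latency in hours
    hitl_load: float         # fraction of tasks escalated to a human, 0..1
    sparsity_probe_ok: bool  # True if the domain-sparsity probe passed


def decide(r: EvalResult) -> str:
    """Return 'kill', 'pivot', 'persevere', or 'undetermined'."""
    # Kill is checked first (assumed order): the case is out-of-regime.
    if r.accuracy < 0.60 or not r.sparsity_probe_ok:
        return "kill"
    # Pivot: accuracy in the 60-85% band, or human-in-the-loop load above 40%.
    if r.accuracy < 0.85 or r.hitl_load > 0.40:
        return "pivot"
    # Persevere needs accuracy >= 85% AND latency under 2 hours.
    if r.latency_hours < 2.0:
        return "persevere"
    # Accuracy is high enough but latency is not: not defined on the page.
    return "undetermined"
```

With the readings shown on this page (88% accuracy, 8% HITL load, probe passed) and any latency under 2 hours, `decide` returns `"persevere"`.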

Engagement Diagnostic · GPT-3 eval-harness logic (Brown et al. 2020) applied

Last run 2h ago · 247-task held-out eval · Mission-Eng domain

Prompt-complexity tier sweep

Tier                 Accuracy     Latency   Cost     HITL
(a) zero-shot        41%          0.8s      $0.001   12%
(b) few-shot         88% ↑ +47    1.2s      $0.003   8%
(c) tool-augmented   90% +2       2.1s      $0.012   6%
(d) extended-think   91% +1       4.5s      $0.024   5%
Curve shape · steep at (a)→(b), flat after

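Marginal returns to prompt complexity collapse after exemplar-rich few-shot: the step from (a) to (b) gains 47 points, while (b) to (c) and (c) to (d) gain only 2 and 1. This is Bucket 1 · Prompting failure: the model has the capability, but the deployment supplied too few exemplars. The tier (b) lift confirms it.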
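
The curve-shape reading can be reproduced from the sweep numbers. The sketch below copies the four tier rows from the table and computes the accuracy gain of each step; the `knee` helper and its 5-point `min_gain` cutoff are assumptions chosen for illustration, not a threshold stated on this page.

```python
# Tier sweep rows copied from the table:
# (tier, accuracy %, latency s, cost $, HITL %)
SWEEP = [
    ("zero-shot", 41, 0.8, 0.001, 12),
    ("few-shot", 88, 1.2, 0.003, 8),
    ("tool-augmented", 90, 2.1, 0.012, 6),
    ("extended-think", 91, 4.5, 0.024, 5),
]


def marginal_gains(sweep):
    """Accuracy gain in points for each step up the tier ladder."""
    return [
        (prev[0], cur[0], cur[1] - prev[1])
        for prev, cur in zip(sweep, sweep[1:])
    ]


def knee(sweep, min_gain=5):
    """Last tier whose step up still gained at least `min_gain` points.

    `min_gain` is a hypothetical cutoff, not a value from the source page.
    """
    best = sweep[0][0]
    for _, cur, gain in marginal_gains(sweep):
        if gain < min_gain:
            break
        best = cur
    return best


for prev, cur, gain in marginal_gains(SWEEP):
    print(f"{prev} -> {cur}: +{gain} pts")
print("knee:", knee(SWEEP))
```

Under that cutoff the knee lands on few-shot, which matches the "steep at (a)→(b), flat after" description: later tiers cost more and take longer for gains of one or two points.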

Failure-mode bucket · 6-way classification

1 · Prompting failure · ⬤ this case
2 · Context / RAG failure · eliminated
3 · Tool-design failure · eliminated
4 · Harness / orchestration failure · eliminated
5 · Capability gap · domain sparsity · eliminated
6 · Evaluation methodology failure · eliminated
Asset Library fix · prompting-pattern-v2 (exemplar harvest)
Pivot if curve doesn't shift after fix · trace lineage available

4-doc PLAN · Business Case / Market Validation / Development / Scaled GTM

75% authored · 1 expanded
Business Case · 12 sections · updated 2d ago
FY27 modeled cost avoidance $450k/yr · 3 engineers not hired · payback 4 months
Market Validation · 8 sections · updated today
3 stakeholder interviews · 1 concierge subject converted · evidence ledger linked
2.1 Hire statement · ✓ approved
2.2 Persona evidence · ✓ 3 of 3
2.3 Concierge MVP log · in flight
2.4 Workflow observability · ✓ approved
2.5 Failure-mode survey · in flight
2.6 Adjacent-pattern review · drafted
2.7 Anti-pattern catalog · drafted
2.8 Validation summary · pending close
Development · 15 sections · 60% complete
Reference architecture · HRA-1 baseline · ITAR-aware deployment posture · ATO pre-work in flight
Scaled GTM · drafted · updated 5d ago
Cohort-pattern catalog seed · capability-transfer plan · partner-interface placeholder
Suggested next action

Promote H-1 to Pilot before Thursday gate

Persevere threshold met (88% > 85%). Bucket 1 fix applied; curve shifted +47 points. 4-doc PLAN at 75%. Only outstanding gate: AI-risk coverage (71%, need 80%).
Gate state · persevere · CTR target ✓ 0.42 · AI-risk coverage ⚠ 71% (need 80%) · Transfer-Posture ✓ ready
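
The gate-state line can be checked mechanically against the minimums stated in the engagement constitution. This is a small sketch under stated assumptions: the `GATES` dictionary layout, the key names, and the `blockers` helper are hypothetical; only the readings (CTR 0.42 against a 0.40 target, AI-risk coverage 71% against 80%, Transfer-Posture ready) come from this page.

```python
# Gate readings copied from the gate-state line; the layout is an assumption.
GATES = {
    "ctr_target": {"value": 0.42, "minimum": 0.40},
    "ai_risk_coverage": {"value": 0.71, "minimum": 0.80},
}
TRANSFER_POSTURE_READY = True


def blockers(gates, transfer_posture_ready):
    """Names of gates that have not cleared yet."""
    open_gates = [
        name for name, g in gates.items() if g["value"] < g["minimum"]
    ]
    if not transfer_posture_ready:
        open_gates.append("transfer_posture")
    return open_gates


print(blockers(GATES, TRANSFER_POSTURE_READY))
```

With the current readings the only entry returned is `ai_risk_coverage`, which is the gate still open ahead of Thursday.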