Hiring managers at AI-first startups waste 1,800 hours/year manually reviewing candidates: reading resumes, intuiting fit, and drafting interview questions without structured frameworks (source: rightfit.so time-tracking study, n=112 managers, Q1 2025). At 50 candidates/role, 8 roles/year, and 15 minutes/candidate, this review work delays hires and produces inconsistent evaluations in 32% of cases (source: internal audit of 500 rejected candidates).
The business case:
2,000 active hiring managers × 400 candidate reviews/year × 13 minutes saved × $0.83/minute (blended $50/hr rate) = $8.66M/year recoverable time
(Source: Manager count from CRM, review volume from 2024 user survey, wage from AngelList startup comp benchmarks)
If adoption is 40% of estimate: $3.46M/year
This excludes the $220K/hire cost of bad-fit candidates that structured red-flag detection helps avoid.
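For transparency, the arithmetic above as a runnable check (same inputs as the model; no new data):

```python
# Sanity check of the recoverable-time model above (same inputs, no new data).
managers = 2_000       # active hiring managers (CRM)
reviews = 400          # candidate reviews per manager per year (2024 survey)
minutes_saved = 13     # per review
rate = 50 / 60         # blended $50/hr -> $/minute (~$0.83)

full = managers * reviews * minutes_saved * rate
print(f"Full adoption: ${full / 1e6:.2f}M/year")        # ~$8.67M (doc rounds $0.83/min -> $8.66M)
print(f"40% adoption:  ${0.4 * full / 1e6:.2f}M/year")  # ~$3.47M (doc: $3.46M)
```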
The product is an AI-generated candidate brief delivering skill-match scores, culture signals, red flags, and gap-specific interview questions in under 10 seconds. It is not an automated hiring decision-maker; human judgment remains central.
Greenhouse addresses this with manual scorecard templates; Ashby with pipeline-analytics dashboards; SeekOut with keyword-based resume search.
| Capability | Greenhouse | Ashby | rightfit.so |
|---|---|---|---|
| Auto skill-gap detection | ❌ | ❌ | ✅ (BERT + custom ML) |
| Culture signal extraction | ❌ (manual tags) | ❌ | ✅ (LLM + engagement metrics) |
| Tailored interview questions | ❌ | ❌ | ✅ |
| Red-flag alerts (e.g., tenure <18mo) | ❌ | ✅ (basic rules) | ✅ (contextual) |
| Where we lose | Enterprise ATS integrations | Visual pipeline builder | ❌ (neither yet) |

Our wedge is vertical-specific AI for AI talent: generic tools miss nuances like "deployed RL models in production" vs. "took a Coursera course."
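The auto skill-gap detection row is worth grounding. As a minimal sketch (not the shipped ResumeBERT + custom ML stack), embedding similarity via the open-source sentence-transformers library can flag JD skills with no semantic counterpart in the resume; the model name and 0.6 threshold below are illustrative:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative sketch only -- production uses ResumeBERT + custom ML.
model = SentenceTransformer("all-MiniLM-L6-v2")

def skill_gaps(jd_skills: list[str], resume_skills: list[str],
               threshold: float = 0.6) -> list[str]:
    """Return JD skills with no semantically similar resume skill."""
    jd_emb = model.encode(jd_skills, convert_to_tensor=True)
    res_emb = model.encode(resume_skills, convert_to_tensor=True)
    sims = util.cos_sim(jd_emb, res_emb)   # shape: (len(jd), len(resume))
    return [skill for skill, row in zip(jd_skills, sims)
            if float(row.max()) < threshold]

print(skill_gaps(["PyTorch", "Vertex AI", "RL in production"],
                 ["PyTorch", "TFX", "model deployment"]))
# expected: ["Vertex AI", "RL in production"] (threshold-dependent)
```

This is why "deployed RL models in production" and "took a Coursera course" separate cleanly: they sit far apart in embedding space even though both mention RL.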
WHO / JTBD: When a startup engineering lead reviews an ML candidate post-sourcing, they need to rapidly assess technical fit and cultural alignment against their AI product’s niche requirements—without manually cross-referencing resumes, GitHub, and behavioral cues for 15 minutes per candidate.
CURRENT BREAKAGE:
| Metric | Measured Baseline |
|---|---|
| Candidate review time | 15.2 min avg (n=347 sessions) |
| Unstructured evaluations | 92% of reviews lack a rubric (audit of 1,000 notes) |
| Interview question prep time | 7 min/candidate (task analysis) |
Annual cost: 2,000 managers × 400 reviews/year × 13 min saved × $0.83/min = $8.66M/year recoverable.
┌───────────────────────────────────────────────────────────────┐
│ PASTE PROFILE │
├───────────────────────────────────────────────────────────────┤
│ [Job Description Text Area] ▾ [Paste from LinkedIn] │
│ [Candidate Profile Text Area] ▾ [Paste resume/CV] │
│ │
│ [Generate Fit Brief] │
└───────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────┐
│ AI FIT BRIEF: Elena Rodriguez - ML Engineer │
├───────────────────────────────────────────────────────────────┤
│ SKILL MATCH: 88% │
│ ✅ STRONG: PyTorch, model deployment, TFX │
│ ⚠ GAP: Vertex AI, RL production experience │
│ │
│ CULTURE SIGNALS │
│ 🟢 Ownership: High (3 shipped products)                       │
│ 🟡 Collaboration: Medium (1 co-authored paper)                │
│ │
│ RED FLAGS │
│ ‼ 11-month avg tenure (vs. 26mo industry) │
│ │
│ INTERVIEW QUESTIONS │
│ "Walk me through your most complex RL deployment" │
│ "How do you debug model drift without ground truth?" │
│ │
│ RECOMMENDATION: Proceed with deep-dive │
└───────────────────────────────────────────────────────────────┘
Core Flow: Paste → Analyze → Brief → Decide. The engine is an ensemble: ResumeBERT for skill extraction, a fine-tuned GPT-4 for narrative analysis, and a custom classifier for culture signals. We rejected "auto-reject" functionality to preserve human agency; every brief is editable before the interview.
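The mockup above maps onto a simple data contract. Below is a minimal orchestration sketch with hypothetical names (`FitBrief`, `generate_brief`) and toy stand-ins for the three models; it is not the production pipeline:

```python
from dataclasses import dataclass

@dataclass
class FitBrief:
    skill_match: int           # 0-100
    strong: list[str]
    gaps: list[str]
    culture: dict[str, str]    # trait -> High / Medium / Low
    red_flags: list[str]
    questions: list[str]
    recommendation: str

def generate_brief(jd_skills: list[str], resume_skills: list[str],
                   culture: dict[str, str], avg_tenure_months: float) -> FitBrief:
    """Toy logic in place of ResumeBERT, the fine-tuned LLM, and the
    culture classifier; shows how the pieces assemble into one brief."""
    strong = [s for s in jd_skills if s in resume_skills]
    gaps = [s for s in jd_skills if s not in resume_skills]
    match = round(100 * len(strong) / len(jd_skills)) if jd_skills else 0

    red_flags = []
    if avg_tenure_months < 18:   # rule mirrored from the red-flag row above
        red_flags.append(f"{avg_tenure_months:.0f}-month avg tenure (vs. 26mo industry)")

    questions = [f"Walk me through hands-on work with {g}" for g in gaps]
    rec = "Proceed with deep-dive" if match >= 70 else "Hold for review"
    return FitBrief(match, strong, gaps, culture, red_flags, questions, rec)

print(generate_brief(["PyTorch", "TFX", "Vertex AI", "RL in production"],
                     ["PyTorch", "TFX", "model deployment"],
                     {"Ownership": "High", "Collaboration": "Medium"},
                     avg_tenure_months=11))
```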
Phase 1 — MVP (5 weeks)
US#1 — Brief Generation: As a hiring manager, I paste a JD and a candidate profile and receive an editable fit brief (skill match, culture signals, red flags, interview questions) in under 10 seconds.
Out of Scope (Phase 1)
| Feature | Why Not Phase 1 |
|---|---|
| Multi-doc parsing | Requires PDF/OCR pipeline (Q3) |
| ATS integrations | API scope exceeds MVP timeline |
| Custom rubric builder | User testing showed 90% use default |
Phase 1.1 (3 weeks): Brief sharing + comment threading
Phase 1.2 (4 weeks): LinkedIn profile auto-import
| Metric | Baseline | Target (D90) | Kill Threshold | Measurement |
|---|---|---|---|---|
| Avg. review time | 15.2 min | ≤4 min | >8 min | Session replay |
| Brief adoption | 0% | 65% | <40% | Event tracking |
| Hiring velocity | 22 days | ≤16 days | >20 days | CRM pipeline data |
| Guardrail | Threshold | Action |
|---|---|---|
| False negative rate | ≤3% | Retrain model |
| P95 latency | <2.5s | Scale inference pods |
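A minimal sketch of how these guardrails could be evaluated from logged request latencies and flag audits; thresholds come from the table, while the function and field names are assumptions:

```python
import statistics

def check_guardrails(latencies_s: list[float],
                     false_negatives: int, audited_flags: int):
    """Evaluate the guardrail table from logs (needs >= 2 latency samples)."""
    p95 = statistics.quantiles(latencies_s, n=100)[94]   # 95th percentile
    fn_rate = false_negatives / audited_flags if audited_flags else 0.0
    actions = []
    if p95 >= 2.5:          # P95 latency guardrail: < 2.5s
        actions.append("scale inference pods")
    if fn_rate > 0.03:      # false-negative guardrail: <= 3%
        actions.append("queue model retrain")
    return p95, fn_rate, actions

print(check_guardrails([1.2, 1.9, 2.1, 2.8, 1.4], false_negatives=2,
                       audited_flags=100))
```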
Not Measuring:
Risk: Hallucinated red flags (e.g., falsely claims candidate lacks PyTorch)
Probability: Medium | Impact: High
Mitigation: Rule-based validator for P0 attributes (skills/titles). Owner: ML Lead by W3.
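As a concrete sketch of this mitigation (the alias table and names are illustrative, not the shipped validator): before a "missing skill" red flag is surfaced, confirm that no known alias of the skill appears verbatim in the resume.

```python
# Illustrative validator: suppress hallucinated "missing skill" flags
# when the skill (or a known alias) appears verbatim in the resume.
ALIASES = {"pytorch": {"pytorch", "torch"},
           "vertex ai": {"vertex ai", "vertexai"}}

def confirm_skill_gap(skill: str, resume_text: str) -> bool:
    """Return True only if no alias of `skill` occurs in the resume."""
    text = resume_text.lower()
    terms = ALIASES.get(skill.lower(), {skill.lower()})
    return not any(term in text for term in terms)

assert confirm_skill_gap("Vertex AI", "5 yrs PyTorch, shipped TFX pipelines")
assert not confirm_skill_gap("PyTorch", "Built torch training loops")
```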
────────────────────────────────────────────────
Risk: GDPR violation for EU candidate data
Probability: High | Impact: Critical
Mitigation: Anonymize pre-processing; legal sign-off on flow. Owner: CPO by W2.
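A minimal sketch of the anonymize-before-processing step, assuming simple regex scrubbing for email and phone; the real mitigation would add NER-based name detection and the legal sign-off noted above:

```python
import re

# Minimal PII scrubber applied before any model call (illustrative only;
# names require NER, which regexes cannot reliably catch).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(anonymize("Contact: elena@example.eu, +34 612 345 678"))
# -> "Contact: [EMAIL], [PHONE]"
```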
────────────────────────────────────────────────
Risk: Competitor clones feature in 60 days (e.g., Ashby)
Probability: Medium | Impact: Medium
Mitigation: Patent pending on culture signal extraction. Owner: Counsel by W6.
────────────────────────────────────────────────
Risk: Managers over-rely on AI recs
Probability: Low | Impact: High
Mitigation: Mandatory "override reason" for non-recommended hires.
Kill Criteria: any metric breaches its kill threshold at D90 (avg review time >8 min, brief adoption <40%, hiring velocity >20 days).
Pilot (2 weeks):
Decision: Depth vs. breadth of skill taxonomy
Choice: 120 AI-specific competencies (e.g., "Transformer fine-tuning") over generic 500+ skill library
Rationale: Startup AI roles require niche expertise; generic terms ("Python") create false positives.
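A sketch of what depth-over-breadth looks like in matching code; the taxonomy entries and cue phrases below are illustrative samples, not the real 120-competency list:

```python
# Illustrative slice of the niche taxonomy: each entry is a specific,
# testable capability, never a generic token like "Python".
NICHE_TAXONOMY = {
    "transformer-fine-tuning": ["fine-tuned llama", "lora", "peft"],
    "rl-in-production": ["deployed rl", "rl models in production",
                         "online policy serving"],
}

def matched_competencies(resume_text: str) -> set[str]:
    text = resume_text.lower()
    return {comp for comp, cues in NICHE_TAXONOMY.items()
            if any(cue in text for cue in cues)}

# Generic tokens match nothing, so no false positive:
print(matched_competencies("Python; took a Coursera RL course"))      # set()
print(matched_competencies("Deployed RL models in production at X"))  # {'rl-in-production'}
```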
────────────────────────────────────────────────
Decision: Handling unverifiable claims (e.g., "built GPT-4 competitor")
Choice: Flag with ‼ without penalizing score
Rationale: Avoid over-indexing on resume puffery; surface for interview probing.
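A sketch of the rule as code (the dict shape and names are assumptions): an unverifiable claim adds a ‼ marker and a probing question, and deliberately never touches the score.

```python
def apply_unverifiable_claims(brief: dict, claims: list[str]) -> dict:
    """Flag puffery for interview probing without penalizing the score."""
    for claim in claims:
        brief["red_flags"].append(f"‼ Unverified: {claim}")
        brief["questions"].append(f"Tell me the specifics behind: {claim}")
    # Deliberately no change to brief["skill_match"].
    return brief

brief = {"skill_match": 88, "red_flags": [], "questions": []}
apply_unverifiable_claims(brief, ['built a "GPT-4 competitor"'])
assert brief["skill_match"] == 88   # score unchanged
```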
────────────────────────────────────────────────
Decision: Culture signal framework
Choice: Startup-specific traits (Ownership, Velocity, Grit) over corporate values
Rationale: "Innovation" scores are meaningless; shipping velocity correlates with startup success.
────────────────────────────────────────────────
Decision: Data retention policy
Choice: Briefs persist 30 days; raw resumes deleted post-processing
Rationale: Balances reusability with GDPR candidate privacy requirements.
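A sketch of the retention job this policy implies, using an in-memory store for illustration; raw resumes never reach this store because they are deleted immediately after processing:

```python
from datetime import datetime, timedelta, timezone

BRIEF_TTL = timedelta(days=30)   # briefs persist 30 days per policy

def enforce_retention(briefs: dict[str, dict], now: datetime | None = None) -> None:
    """Daily cleanup: drop briefs past TTL (expects a 'created_at' field)."""
    now = now or datetime.now(timezone.utc)
    for brief_id in [bid for bid, b in briefs.items()
                     if now - b["created_at"] > BRIEF_TTL]:
        del briefs[brief_id]
```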
Before: Maya (Eng Lead, Series A AI infra) pastes an ML candidate’s resume into her notes. She scans for keywords, misses the lack of cloud deployment experience, and drafts generic questions. After 18 minutes, she’s unsure—schedules a screening "just in case."
After: Maya pastes the JD and resume. In 6 seconds, she sees a 34% infra gap and the suggested question: "Describe an ML pipeline you containerized." She probes this in the interview and uncovers dealbreaker gaps. Total time: 3 minutes.
Pre-Mortem:
"It’s 6 months post-launch and adoption flatlined. Why?
Success looks like: Hiring managers start calls with "I saw your Vertex AI gap—let’s dig in." Founders report 30% faster technical interviews. Support tickets for "how to evaluate candidates" drop by half.