| Problem | Evidence | Cost to Business |
|---|---|---|
| BFSI teams manually author model deployment specs, requiring 8.2 hours per model (n=32 client interviews) to document data contracts, fallback logic, and compliance controls. This delays deployments by 3-5 weeks and introduces critical gaps: 28% of specs omit RBI-mandated bias tests (source: 2024 arya.ai compliance audit). | 12.7 FTEs wasted annually per $1B AUM bank on spec creation/maintenance (source: Deloitte BFSI DevOps Survey 2025). 67% of production incidents trace to undocumented edge cases (source: internal RCA database, n=189). | Direct authoring: 42 models deployed/year × 8.2 hours × $98/hr (blended eng cost) ≈ $33.8K/year. Spec-driven delays: 42 models × 3.4 weeks × $9.8K/week ≈ $1.4M/year, for ~$1.42M/year total exposure (source: Regional Cost Benchmarks). At 40% adoption: $568K/year recoverable. |
| Solution | Mechanism | Expected Impact |
|---|---|---|
| AI-generated deployment specs via guided questionnaire | Convert 6 use-case inputs into auditable specs with embedded compliance checks and monitoring thresholds | Reduce spec creation time to ≤45 min. Cut deployment delays by 2.4 weeks/model. Eliminate 92% of compliance gaps (target). |
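The questionnaire-to-spec mechanism above can be sketched as a simple mapping. This is an illustrative Python sketch, not the production schema: the field names, the six-answer shape, and the `RBI_CHECKS` lists are assumptions made for the example.

```python
# Hypothetical sketch of the guided-questionnaire flow: six use-case
# answers are mapped to a deployment-spec dict with compliance checks
# attached by use case. Check names and fields are illustrative only.
RBI_CHECKS = {
    "credit": ["gender_bias_false_negatives", "rural_credit_clause"],
    "fraud": ["false_positive_rate_by_segment"],
    "kyc": ["fatf_watchlist_refresh"],
}

def generate_spec(answers: dict) -> dict:
    """answers holds the six questionnaire inputs."""
    use_case = answers["use_case"]  # e.g. "credit" | "fraud" | "kyc"
    spec = {
        "model_name": answers["model_name"],
        "inputs": answers["input_schema"],
        "outputs": answers["output_schema"],
        "risk_class": answers["risk_class"],  # e.g. "P1"
        "fallback": answers["fallback"],
        "compliance_checks": RBI_CHECKS.get(use_case, []),
        "monitoring": {"psi_threshold": 0.25},
    }
    # P1/P2 models without a fallback must carry a justification,
    # mirroring the adversarial-validation defense described later.
    if answers["risk_class"] in ("P1", "P2") and not answers["fallback"]:
        spec["requires_justification"] = True
    return spec
```

The point of the sketch is that compliance checks are attached by default from the use-case answer, rather than left to the author to remember.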
| Risk | Probability | Kill Criteria |
|---|---|---|
| Generator omits jurisdiction-specific requirements (e.g. RBI Master Direction on AI) | Medium | If >15% of generated specs fail compliance review in D90 pilot |
Synthesis
This feature automates model deployment spec generation for BFSI workflows using structured interviews, embedding regulatory guardrails by default. It is NOT a model validator, runtime monitor, or replacement for legal review. Our downside case: $568K/year at 40% adoption still justifies build costs (est. $310K).
Primary Metrics
| Metric | Baseline | Target | Kill Threshold | Measurement |
|---|---|---|---|---|
| Spec creation time | 8.2 hrs | ≤0.75 hrs | >2.5 hrs at D90 | Time-tracking |
| Compliance gaps | 2.1/spec | 0.2/spec | >0.8/spec at D90 | Audit results |
| Deployment delay | 3.4 wks | ≤1.5 wks | >2.8 wks at D90 | Jira logs |
Guardrail Metrics
| Guardrail | Threshold | Action |
|---|---|---|
| False alert rate | ≤3% | Tune thresholds |
| User edit rate | ≤25% | Improve templates |
What We Are NOT Measuring
Strategic Decisions Log
Decision: Support non-BFSI use cases?
Choice: Phase 1: BFSI-only (credit, fraud, KYC)
Rationale: 78% of revenue from BFSI; generic solution increases compliance risk
Decision: Real-time schema validation?
Choice: Require sample data upload
Rationale: Prevents hypothetical schemas; rejected "manual entry only" as error-prone
Premortem
It is 6 months post-launch and this feature failed. Top 3 reasons:
What success looks like:
Product teams deploy models in 48 hours with zero compliance tickets. Engineering VP says: "We redirected 9 FTEs to high-impact work." Auditors cite arya.ai specs as compliance benchmarks.
Assumptions vs Validated
| Assumption | Status |
|---|---|
| RBI allows AI-generated compliance docs | ⚠ Unvalidated — Legal signoff by 10/15 |
| Fraud teams accept auto-configured thresholds | ⚠ Unvalidated — Pilot with 3 banks by 11/30 |
| Schema generator handles nested JSON | ⚠ Unvalidated — Eng spike by 9/20 |
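The nested-JSON assumption above (eng spike by 9/20) amounts to recursive type inference over a sample payload. A minimal sketch of the question being spiked, assuming a JSON-Schema-like output shape that is not the shipped generator:

```python
# Illustrative sketch: recursively infer a field-type schema from a
# sample JSON payload, including nested objects and arrays.
def infer_schema(sample):
    if isinstance(sample, dict):
        return {"type": "object",
                "fields": {k: infer_schema(v) for k, v in sample.items()}}
    if isinstance(sample, list):
        item = infer_schema(sample[0]) if sample else {"type": "unknown"}
        return {"type": "array", "items": item}
    if isinstance(sample, bool):   # check bool before int: bool subclasses int
        return {"type": "boolean"}
    if isinstance(sample, int):
        return {"type": "integer"}
    if isinstance(sample, float):
        return {"type": "number"}
    return {"type": "string"}
```

This also motivates the "require sample data upload" decision above: inference needs a concrete payload, not a hypothetical schema.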
Core Objective
Generate production-ready deployment specs in under 1 hour that:
Competitive Landscape
| Capability | Tecton | Seldon | arya.ai |
|---|---|---|---|
| Auto-generates input/output schemas | ✅ | ✅ | ✅ |
| Embeds regulatory checklists | ❌ | Partial | ✅ (RBI/FATF preloaded) |
| Fallback logic templates | ✅ | ❌ | ✅ (BFSI-specific) |
| Where we lose | Ecosystem integration | Performance at scale | — |
Our wedge is compliance-by-design, because competitors treat regulation as an afterthought.
Quantified Baseline
| Metric | Measured Baseline |
|---|---|
| Spec creation time | 8.2 hours avg (n=32 client models) |
| Compliance gaps per spec | 2.1 critical omissions (2024 audit) |
| Deployment delay due to spec issues | 3.4 weeks (Q2 2025 ops review) |
Value recovery: 42 models × 3.4 weeks saved × $9.8K/week eng cost ≈ $1.4M/year.
Before/After Narrative
Before: Priya (Lead Data Eng, NeoBank) spends 3 days manually translating a fraud model’s Python notebook into a 40-page deployment spec. She misses RBI’s new requirement to monitor gender bias in false negatives, causing a 2am incident when biased rejections spike.
After: Priya answers 6 questions about the fraud model’s inputs, outputs, and risk class. The generator produces a compliant spec with pre-configured bias monitors. She deploys in 4 hours with explicit signoff from Legal.
Adversarial Validation
Attack: User selects "fraud detection" but inputs mismatched data types
Defense: Type coercion checks with user confirmations
Limitation: Cannot resolve semantic mismatches (e.g., "income" vs. "revenue")
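The type-coercion defense above can be sketched as a pre-flight check that separates coercible mismatches (surface to the user for confirmation) from hard failures. A hedged sketch; the function name and declared-schema shape are assumptions for illustration:

```python
# Sketch of the type-coercion defense: attempt to coerce each value to
# the declared type and flag fields that need user confirmation.
# Semantic mismatches (e.g. "income" vs. "revenue") are out of scope,
# matching the stated limitation.
def check_types(declared: dict, row: dict) -> list:
    """Return fields whose values need user-confirmed coercion."""
    needs_confirmation = []
    for field, expected in declared.items():
        value = row.get(field)
        if isinstance(value, expected):
            continue
        try:
            expected(value)  # coercible, but ask the user first
            needs_confirmation.append(field)
        except (TypeError, ValueError):
            raise ValueError(
                f"{field}: cannot coerce {value!r} to {expected.__name__}")
    return needs_confirmation
```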
Attack: Malicious actor injects script tags in field descriptions
Defense: Sanitize outputs to plaintext
Limitation: Loses rich formatting
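The plaintext defense above is essentially "strip script bodies, then escape the rest". A minimal sketch using the Python standard library; the function name is illustrative:

```python
# Minimal sketch of the plaintext defense: drop script bodies from
# user-supplied field descriptions, then escape remaining markup before
# embedding in the generated spec. Rich formatting is lost by design.
import html
import re

def sanitize_description(text: str) -> str:
    text = re.sub(r"<\s*script[^>]*>.*?<\s*/\s*script\s*>", "", text,
                  flags=re.IGNORECASE | re.DOTALL)  # remove script bodies
    return html.escape(text)                        # escape remaining markup
```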
Attack: User omits high-risk dependencies (e.g., "no fallback needed")
Defense: Require justification for P1/P2 model exemptions
Limitation: Cannot force fallbacks for non-critical models
Phased Acceptance Criteria
Phase 1 — MVP (6 weeks)
US#1 — Generate input/output schemas
US#2 — Embed RBI bias checks
Out of Scope (Phase 1)
| Feature | Why Not Phase 1 |
|---|---|
| Dynamic threshold tuning | Requires live traffic patterns |
| Cross-jurisdiction compliance | Limited to RBI/FATF baseline |
| On-prem deployment | Cloud-only initial release |
Mandatory Oversight Points
Pre-deployment:
Runtime:
Override Mechanics
Risk Register
Risk: Generator omits RBI Master Direction 5.2.3(c) for rural credit models
Probability: Medium | Impact: High
Mitigation: Preload jurisdiction-specific clauses (Owner: Compliance Lead by 9/30)
Fallback: If unvalidated by deadline, restrict rural model deployment
Risk: Generated thresholds cause false alerts (e.g., a default PSI alert threshold of 0.25 proves too sensitive)
Probability: High | Impact: Medium
Mitigation: Embed threshold calculators for common metrics (Owner: ML Eng by 10/15)
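For concreteness, the PSI threshold in question is the standard Population Stability Index over matched probability bins. A sketch of the calculator the mitigation refers to, with the 0.25 default taken from the risk note; the epsilon guard is an implementation assumption:

```python
# Sketch of a PSI threshold calculator: compute the Population Stability
# Index between baseline and live bin frequencies, so the alert threshold
# (0.25 here, per the risk note) can be tuned per metric.
import math

def psi(expected: list, actual: list, eps: float = 1e-6) -> float:
    """Standard PSI: sum of (actual - expected) * ln(actual / expected)."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # guard against log(0)
        total += (a - e) * math.log(a / e)
    return total

def drift_alert(expected, actual, threshold: float = 0.25) -> bool:
    return psi(expected, actual) > threshold
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift, which is why a fixed 0.25 can be too sensitive for naturally volatile metrics.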
Risk: Adversarial inputs exploit schema generator (e.g., fake field names)
Probability: Low | Impact: Critical
Mitigation: Input sanitization with allowlists (Owner: Security by 9/25)
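The allowlist mitigation above can be sketched as a conservative pattern check on field names. The pattern and length cap are assumptions for illustration, not the shipped security policy:

```python
# Illustrative allowlist defense for schema-generator inputs: accept only
# field names that start with a letter, use word characters, and stay
# under a bounded length. Everything else is rejected and reported.
import re

FIELD_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]{0,63}$")

def validate_field_names(names: list) -> list:
    """Return the rejected names so callers can report all at once."""
    return [n for n in names if not FIELD_NAME.fullmatch(n)]
```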
Kill Criteria
>15% of D90 specs require manual remediation for compliance gaps
Embedded Controls
Credit Models:
KYC Models:
Validation Protocol