Current Day 4 architecture

Right now:

Document/PDF
→ Gemini extraction
→ ClaimExtraction JSON
→ deterministic validation rules
→ COMPLETED or NEEDS_REVIEW

Example rule:

If lossType = theft and FIR number is missing
→ NEEDS_REVIEW

This rule is hardcoded because you already know that theft claims need an FIR (First Information Report) or other police evidence.

This is good for Week 1.
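
As a rough illustration, the hardcoded theft rule above could look like the sketch below. It assumes the ClaimExtraction JSON has already been parsed into a Python dict. The field name firNumber and the function names are hypothetical and are not taken from your actual schema.

```python
from typing import List


def review_reasons(claim: dict) -> List[str]:
    """Collect the reasons a claim should go to human review."""
    reasons = []
    # Hardcoded rule: theft claims need an FIR number.
    # "firNumber" is an assumed field name, not a confirmed one.
    if claim.get("lossType") == "theft" and not claim.get("firNumber"):
        reasons.append("theft claim without FIR number")
    return reasons


def validate_claim(claim: dict) -> str:
    """Return NEEDS_REVIEW if any rule fired, otherwise COMPLETED."""
    return "NEEDS_REVIEW" if review_reasons(claim) else "COMPLETED"
```

Keeping the reasons as a list means a reviewer can see why a claim was flagged, and new rules only need one more `if` block.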


What happens later with thousands of documents?

With many documents, you will discover new patterns.

For example, assume your current validator says:

own_damage claim + no repair estimate
→ still COMPLETED (no rule flags it)

But after 1000 documents, human reviewers repeatedly mark these as NEEDS_REVIEW.

The system can learn:

Pattern:
own_damage claims without repairEstimate are often sent to review.

Evidence:
87 out of 92 similar cases were marked NEEDS_REVIEW by human reviewers.

Suggested rule:
If lossType = own_damage and supportingDocuments.repairEstimate = false
→ NEEDS_REVIEW

But the system should not silently deploy that rule. It should create a rule suggestion.
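
One possible way to turn reviewer decisions into a rule suggestion is sketched below. It only counts how often matching claims were sent to review and, if the pattern is strong enough, returns a suggestion object that stays in a pending state. The RuleSuggestion class, the PENDING_APPROVAL status, and the min_cases and min_rate thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RuleSuggestion:
    condition: str
    outcome: str
    matched_cases: int
    review_cases: int
    status: str = "PENDING_APPROVAL"  # assumed status; never auto-deployed

    @property
    def review_rate(self) -> float:
        return self.review_cases / self.matched_cases


def suggest_repair_estimate_rule(
    cases: List[Tuple[dict, str]],
    min_cases: int = 50,    # assumed threshold: enough evidence
    min_rate: float = 0.9,  # assumed threshold: strong enough pattern
) -> Optional[RuleSuggestion]:
    """cases holds (claim dict, human reviewer decision) pairs."""
    decisions = [
        decision
        for claim, decision in cases
        if claim.get("lossType") == "own_damage"
        and not (claim.get("supportingDocuments") or {}).get("repairEstimate", False)
    ]
    if len(decisions) < min_cases:
        return None
    flagged = sum(1 for d in decisions if d == "NEEDS_REVIEW")
    if flagged / len(decisions) < min_rate:
        return None
    return RuleSuggestion(
        condition="lossType = own_damage and supportingDocuments.repairEstimate = false",
        outcome="NEEDS_REVIEW",
        matched_cases=len(decisions),
        review_cases=flagged,
    )
```

With the numbers from the example (87 of 92 matching cases marked NEEDS_REVIEW), this returns a suggestion with a review rate of about 0.946. The validator itself is not changed until someone approves the suggestion.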


Future architecture

Layer 1 — AI extraction

Gemini extracts facts: