Upload PDF/email creates run
✅ Run extraction works
✅ Gemini saves extractedJson
✅ Status moves to VALIDATING
✅ Run validation works
✅ App saves validationJson and missingFieldsJson
✅ Status becomes COMPLETED or NEEDS_REVIEW
✅ UI shows validation summary, missing fields, conflicts/warnings, required evidence
✅ Timeline proves extraction + validation steps
✅ All 5 synthetic PDFs tested
✅ README updated
Upload claim PDF/email
→ extract structured claim JSON with Gemini
→ validate extracted data with deterministic rules
→ detect missing fields, conflicts, warnings, and required evidence
→ save validation result in Postgres
→ mark run as COMPLETED or NEEDS_REVIEW
→ show validation summary and timeline in UI
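
The last two steps of this flow can be sketched as a small pure function. This is only an illustration: the `ValidationResult` shape, the `decideRunStatus` name, and the policy that warnings alone do not block completion are assumptions, not the app's confirmed implementation.

```typescript
// Hypothetical shape of a validation result; field names are assumptions.
type ValidationResult = {
  missingFields: string[];
  conflicts: string[];
  warnings: string[];
  requiredEvidence: string[];
};

type RunStatus = "COMPLETED" | "NEEDS_REVIEW";

// Assumed policy: any missing field, conflict, or outstanding evidence
// sends the run to review; warnings alone do not block completion.
function decideRunStatus(result: ValidationResult): RunStatus {
  const blocking =
    result.missingFields.length +
    result.conflicts.length +
    result.requiredEvidence.length;
  return blocking === 0 ? "COMPLETED" : "NEEDS_REVIEW";
}
```

Keeping this decision in one deterministic function means the same validation result always yields the same status, which makes the timeline easier to audit.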

Day 4 Implementation

Preparing Synthetic Data

Converted the synthetic .md files to .pdf using md-to-pdf:

cd claimflow_ai

for file in sample-data/auto-insurance/v1/source-docs/synthetic/*.md; do
  bunx md-to-pdf "$file"
done

Optional later improvement

Later, for Day 6 polish, you can add a sample document selector in the UI:

Choose sample:
- valid-own-damage-claim.pdf
- missing-policy-number.pdf
- repair-estimate-only.pdf
- third-party-without-police-report.pdf
- theft-claim-missing-fir.pdf

But internally it should still create a normal upload/run record. The sample selector should not bypass your workflow.
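
A minimal sketch of that constraint, using an in-memory array in place of the real run table. The `Run` type, the `createRun` and `createRunFromSample` names, and the `UPLOADED` initial status are all hypothetical; the point is only that the sample path calls the same entry point as a manual upload.

```typescript
// Hypothetical in-memory stand-in for the run table; names are assumptions.
type Run = {
  id: number;
  fileName: string;
  source: "upload" | "sample";
  status: "UPLOADED"; // assumed initial status
};

const runs: Run[] = [];

// The single entry point every document goes through.
function createRun(fileName: string, source: Run["source"]): Run {
  const run: Run = { id: runs.length + 1, fileName, source, status: "UPLOADED" };
  runs.push(run);
  return run;
}

// Choosing a sample only decides which file is used; it still calls createRun,
// so extraction and validation run exactly as they do for a manual upload.
function createRunFromSample(sampleName: string): Run {
  return createRun(sampleName, "sample");
}
```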

For now, the simplest correct path is:

Keep sample-data at root.
Convert markdown to PDFs there.
Upload those PDFs manually through your app.
Let the app copy them into apps/web/uploads at runtime.

How is the validation schema designed?

The real source of truth is your business rule definition:

policyNumber is required
claimantName or insuredName is required
theft requires FIR / police report
third-party claim needs police report review
repair estimate alone is not enough
low confidence goes to review
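
The rules above could be written as one deterministic function. This is a rough sketch, not the app's actual validator: the `ExtractedClaim` fields, the document labels, the 0.7 confidence threshold, and the reading of the third-party rule as "flag when no police report is attached" are all assumptions.

```typescript
// Hypothetical extracted-claim shape; the real extractedJson may differ.
type ExtractedClaim = {
  policyNumber?: string;
  claimantName?: string;
  insuredName?: string;
  claimType?: "own_damage" | "third_party" | "theft";
  documents: string[]; // assumed labels, e.g. "police_report", "repair_estimate"
  confidence: number; // assumed 0..1 score from the extraction step
};

type Findings = {
  missingFields: string[];
  warnings: string[];
  requiredEvidence: string[];
};

const LOW_CONFIDENCE = 0.7; // assumed threshold

function validateClaim(claim: ExtractedClaim): Findings {
  const f: Findings = { missingFields: [], warnings: [], requiredEvidence: [] };
  const hasPoliceReport = claim.documents.some((d) => d === "police_report");

  // policyNumber is required
  if (!claim.policyNumber?.trim()) f.missingFields.push("policyNumber");
  // claimantName or insuredName is required
  if (!claim.claimantName?.trim() && !claim.insuredName?.trim()) {
    f.missingFields.push("claimantName or insuredName");
  }
  // theft requires FIR / police report
  if (claim.claimType === "theft" && !hasPoliceReport) {
    f.requiredEvidence.push("FIR / police report");
  }
  // third-party claim needs police report review
  if (claim.claimType === "third_party" && !hasPoliceReport) {
    f.warnings.push("third-party claim without police report: needs review");
  }
  // repair estimate alone is not enough
  if (claim.documents.length > 0 && claim.documents.every((d) => d === "repair_estimate")) {
    f.requiredEvidence.push("supporting documents beyond the repair estimate");
  }
  // low confidence goes to review
  if (claim.confidence < LOW_CONFIDENCE) {
    f.warnings.push("low extraction confidence: needs review");
  }
  return f;
}
```

Because each rule is a plain conditional over the extracted JSON, adding or changing a rule later is a local edit, and the Gemini output never decides the outcome on its own.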

How will validation rules change over time?

How will future workflow changes affect validation rules?

1. Run Validation for valid-own-damage-claim.pdf