Your repo is still correctly shaped for the 8-week ClaimFlow AI roadmap: root has apps/web, packages, docs, and sample-data; apps/web contains the Next.js app, API routes, dashboard, and run detail pages; packages contains ai, db, shared, and supporting workspace packages. (GitHub) (GitHub) (GitHub)
Current backend foundation is good for Week 3:
- Document, ExtractionRun, ExtractionEvent, ReviewTask, ReviewDecision, and ReviewEvent already exist in Prisma.
- ExtractionRun.status is still correctly limited to UPLOADED, EXTRACTING, VALIDATING, COMPLETED, NEEDS_REVIEW, FAILED; human review state is separate in ReviewTaskStatus.
- The extraction step uses @repo/ai, stores rawModelOutput, extractedJson, confidenceJson, and moves the run to VALIDATING.
- The validation step runs validateClaimExtraction(), saves validationJson, creates timeline events, and creates a ReviewTask when final status is NEEDS_REVIEW.
- The ReviewTask records missingFields, conflicts, warnings, and requiredEvidence inside reasonJson.
- packages/evals already exists and has Week 1 / Week 2 eval scripts wired through root package scripts, so Week 3 should add one focused RAG eval script instead of inventing a new eval framework.

One important technical gap: your current docker-compose.yml uses plain postgres:16, not a pgvector-enabled image. For Week 3, you need to either switch to a pgvector-enabled Postgres image, or store vectors as JSON/float arrays and compute cosine similarity in TypeScript for the first version. Since this week is about learning real RAG, I recommend pgvector.
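
For comparison, the TypeScript fallback (vectors stored as float arrays, cosine similarity computed in application code) could look roughly like the sketch below. The `PolicyChunk` shape and the function names are assumptions for illustration, not anything that exists in your repo, and the brute-force scan is only reasonable for a small policy corpus.

```typescript
// Hypothetical shape for a stored policy chunk; field names are assumptions.
interface PolicyChunk {
  id: string;
  text: string;
  embedding: number[]; // e.g. parsed from a JSON column
}

interface ScoredChunk {
  chunk: PolicyChunk;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k retrieval: score every chunk, sort descending, keep k.
function topKChunks(
  queryEmbedding: number[],
  chunks: PolicyChunk[],
  k: number,
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({
      chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This scores every chunk on every query, which is the main reason to prefer pgvector once the corpus grows: the database can do the nearest-neighbour search and the app only receives the top results.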
The original 8-week plan is:
| Week | Product demo | Topic |
|---|---|---|
| Week 1 | Upload doc → structured JSON | Document AI |
| Week 2 | Review queue + workflow states | Durable orchestration + HITL |
| Week 3 | Policy-grounded answer | RAG + citations |
| Week 4 | Agent chooses next action | Tool calling + guardrails |
| Week 5 | Uses past corrections | Memory |
| Week 6 | AI gateway + eval dashboard | Governance + observability |
| Week 7 | Mini repo assistant | Terminal agents |
| Week 8 | Fine-tuning decision report | Tuning judgment |
So Week 3 should not become “ask Gemini with policy text.” The Week 3 product should be:
User asks a coverage question → system retrieves relevant policy clauses → answer cites exact policy chunks → if evidence is missing, answer refuses or marks NEEDS_REVIEW → eval checks retrieval and citation support.
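
The flow above can be sketched as a gating step plus a small eval check. This is a minimal sketch under stated assumptions: every type, function name, and the `0.75` relevance threshold is hypothetical, the model call that produces the draft answer is left out, and the citation check is a crude proxy (cited chunk ids must fall inside a hand-labelled expected set), not a semantic support check.

```typescript
// All names below are hypothetical; none of them exist in the repo.
type AnswerStatus = "ANSWERED" | "NEEDS_REVIEW";

interface RetrievedChunk {
  chunkId: string;
  text: string;
  score: number; // similarity score from retrieval
}

interface DraftAnswer {
  text: string;
  citedChunkIds: string[]; // chunk ids the model claims to rely on
}

interface GroundedAnswer {
  status: AnswerStatus;
  text: string | null;
  citations: RetrievedChunk[];
  reason?: string;
}

function needsReview(reason: string): GroundedAnswer {
  return { status: "NEEDS_REVIEW", text: null, citations: [], reason };
}

// Accept the draft only if every citation points at a retrieved chunk
// that cleared the relevance threshold; otherwise route to review.
function groundAnswer(
  draft: DraftAnswer,
  retrieved: RetrievedChunk[],
  minScore = 0.75, // assumed threshold; tune it against the eval set
): GroundedAnswer {
  const evidence = new Map<string, RetrievedChunk>();
  for (const chunk of retrieved) {
    if (chunk.score >= minScore) evidence.set(chunk.chunkId, chunk);
  }
  if (evidence.size === 0) {
    return needsReview("no policy chunk met the relevance threshold");
  }
  if (draft.citedChunkIds.length === 0) {
    return needsReview("answer has no citations");
  }
  const citations: RetrievedChunk[] = [];
  for (const id of draft.citedChunkIds) {
    const chunk = evidence.get(id);
    if (!chunk) {
      return needsReview(`cited chunk ${id} is not in the accepted evidence`);
    }
    citations.push(chunk);
  }
  return { status: "ANSWERED", text: draft.text, citations };
}

interface EvalResult {
  retrievalHit: boolean; // at least one expected chunk was retrieved
  citationsInExpectedSet: boolean; // answered, and cited only expected chunks
}

// Eval check for one labelled case (expectedChunkIds come from hand labelling).
function evaluateCase(
  retrieved: RetrievedChunk[],
  answer: GroundedAnswer,
  expectedChunkIds: string[],
): EvalResult {
  const expected = new Set(expectedChunkIds);
  return {
    retrievalHit: retrieved.some((c) => expected.has(c.chunkId)),
    citationsInExpectedSet:
      answer.status === "ANSWERED" &&
      answer.citations.every((c) => expected.has(c.chunkId)),
  };
}
```

Keeping the gate as a pure function over the draft and the retrieved chunks means the Week 3 eval script can exercise it without calling the model or the database.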
That is exactly the right Applied AI direction.