An intelligent document processing platform that classifies documents, extracts entities, generates summaries,
and answers natural-language questions about uploaded documents using Retrieval-Augmented Generation (RAG).
Java 21 | Spring Boot | Spring AI | React | OpenAI | pgvector
// Impact: AI-powered RAG implementation for citation-backed answers
{ FEATURES }
✓ Engineered a RAG pipeline using Spring AI and pgvector: documents are parsed via Apache Tika, split into
token-aware chunks, embedded as 1536-dim vectors, and queried via cosine similarity to ground LLM
answers in actual document content
✓ Built structured AI output for document classification and named-entity extraction using Spring AI's
BeanOutputConverter, delivering type-safe JSON responses with confidence scores, with results cached in Redis to
avoid redundant LLM calls
✓ Designed async document ingestion on Java 21 virtual threads (upload returns HTTP 202), with idempotent
uploads via unique request keys, dual authentication (session tokens + BCrypt-hashed API keys), and a
React + shadcn/ui frontend for document management and chat-based Q&A
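
The retrieval step in the RAG feature above (rank chunk embeddings by cosine similarity to the query embedding, then pass the top matches to the LLM) can be sketched in plain Java. This is a minimal in-memory illustration, not the project's code: in the project this ranking would presumably run inside pgvector, whose `<=>` operator computes cosine distance (1 minus cosine similarity). The names `CosineRetrievalSketch`, `Chunk`, and `topK` are hypothetical, and the 3-dimensional vectors stand in for the 1536-dimensional embeddings.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative only: an in-memory stand-in for a vector-store similarity query.
final class CosineRetrievalSketch {

    // Hypothetical record: one document chunk and its embedding vector.
    record Chunk(String text, float[] embedding) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the k chunks most similar to the query, most similar first.
    static List<Chunk> topK(float[] query, List<Chunk> chunks, int k) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble(
                        (Chunk c) -> cosine(query, c.embedding())).reversed())
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        // Toy 3-dimensional embeddings; real embeddings would come from the embedding model.
        List<Chunk> chunks = List.of(
                new Chunk("Invoice total is due in 30 days.", new float[]{0.9f, 0.1f, 0.0f}),
                new Chunk("The contract renews annually.", new float[]{0.1f, 0.9f, 0.2f}),
                new Chunk("Payment terms: net 30.", new float[]{0.8f, 0.2f, 0.1f}));
        float[] query = {1.0f, 0.0f, 0.0f};
        topK(query, chunks, 2).forEach(c -> System.out.println(c.text()));
    }
}
```

The chunks returned by `topK` are what would be placed in the prompt as grounding context, which is also what makes it possible to cite the source passage alongside each answer.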