SEEK (Semantic Embedding & Extraction Kit)

Tech Stack

Python, Hugging Face, FAISS, OpenRouter, DeepSeek

• Developed a scalable Retrieval-Augmented Generation (RAG) pipeline to extract and retrieve insights from a PDF corpus of more than 100 GB, covering data extraction, semantic chunking, and embedding with Hugging Face’s BAAI/bge-base-en-v1.5 model.
• Implemented FAISS for vector search and, after evaluation, found IVFPQ indexing well suited to large-scale retrieval (under 100 ms latency); combined it with query expansion techniques to improve recall.
• Deployed a zero-cost GenAI chatbot powered by DeepSeek LLM via OpenRouter, producing citation-backed, hallucination-resistant answers, and validated the pipeline’s economic and technical feasibility through performance benchmarking.
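
As a rough illustration of the chunking and citation-backed answering described above, the sketch below packs whole sentences into chunks and assembles a prompt with numbered sources. It uses only the Python standard library. The names `Chunk`, `chunk_sentences`, and `build_cited_prompt`, the character budget, and the prompt wording are hypothetical and not taken from the SEEK codebase. Sentence packing is a simple stand-in for semantic chunking, and the embedding, FAISS retrieval, and OpenRouter call are omitted.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # originating document, e.g. a PDF file name (hypothetical field)
    text: str


def chunk_sentences(text: str, source: str, max_chars: int = 200) -> list[Chunk]:
    """Greedily pack whole sentences into chunks of up to max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[Chunk] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(Chunk(source, current))
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(Chunk(source, current))
    return chunks


def build_cited_prompt(question: str, retrieved: list[Chunk]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
```

In a full pipeline, the chunks passed to `build_cited_prompt` would be the top results returned by the vector index rather than every chunk in a document.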
