A document question-answering (QA) pipeline that uses SentenceTransformers embeddings and local LLMs, keeping it private and secure. The original pipeline is from imartinez and the UI is from SamurAI.
- Retrieve a list of documents
- Delete documents
- View which part of the source documents the LLM referred to
- Library versions up to date with original pipeline as of 3rd July 2023
- Python 3.8 or later
- NodeJS v18.12.1 or later
- Minimum 16GB of memory
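The requirements above can be checked with a short script before installing anything. This is a minimal sketch, not part of the project: the function names are hypothetical, it assumes `node` is on the `PATH`, and the memory check relies on `os.sysconf`, which is not available on every platform (it reports "could not check" in that case). Because the operating system usually reports slightly less than the nominal RAM size, the sketch allows a 10% margin below 16 GiB.

```python
import os
import shutil
import subprocess
import sys

# Minimums taken from the requirements list above.
MIN_PYTHON = (3, 8)
MIN_NODE = (18, 12, 1)
MIN_MEMORY_GIB = 16


def parse_version(text):
    """Turn a string such as 'v18.12.1' into a tuple of ints, e.g. (18, 12, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(found, required):
    """Compare version tuples element by element."""
    return tuple(found) >= tuple(required)


def installed_node_version():
    """Return the installed NodeJS version as a tuple, or None if it cannot be determined."""
    node = shutil.which("node")
    if node is None:
        return None
    try:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=True, timeout=10
        )
        return parse_version(result.stdout)
    except (subprocess.SubprocessError, OSError, ValueError):
        return None


def total_memory_gib():
    """Return total physical memory in GiB, or None where os.sysconf cannot report it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 2**30
    except (AttributeError, ValueError, OSError):
        return None


def check_prerequisites():
    """Map each requirement to True (ok), False (below minimum) or None (could not check)."""
    node = installed_node_version()
    memory = total_memory_gib()
    return {
        "python": meets_minimum(sys.version_info[:3], MIN_PYTHON),
        "node": None if node is None else meets_minimum(node, MIN_NODE),
        # Assumption: accept up to 10% below the nominal 16 GiB.
        "memory": None if memory is None else memory >= MIN_MEMORY_GIB * 0.9,
    }


if __name__ == "__main__":
    labels = {True: "ok", False: "below minimum", None: "could not check"}
    for name, ok in check_prerequisites().items():
        print(f"{name}: {labels[ok]}")
```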
- Go to the client folder and run the commands below:
    npm install
    npm run dev
- Go to the server folder and run the commands below:
    pip install -r requirements.txt
    python privateGPT.py
- Open http://localhost:3000 and click on download model to download the required model before first use.
- Upload any document of your choice and click on Ingest data. Ingestion is fast.
- Now run any query on your data. Querying is slow, so allow some time for the answer.
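Before opening the browser, it can help to confirm that the client started by `npm run dev` is actually answering on http://localhost:3000. The sketch below uses only the Python standard library; the function name `ui_is_reachable` is hypothetical and not part of the project, and it only checks that the page responds, not that the server or the model is ready.

```python
import http.client
import urllib.error
import urllib.request


def ui_is_reachable(url="http://localhost:3000", timeout=3.0):
    """Return True if the URL answers with a 2xx or 3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status or malformed response.
        return False


if __name__ == "__main__":
    if ui_is_reachable():
        print("Client is up: open http://localhost:3000")
    else:
        print("Client is not reachable yet: check that 'npm run dev' is running")
```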
The supported extensions for documents are:
- .csv: CSV
- .doc/.docx: Word Document
- .enex: EverNote
- .eml: Email
- .epub: EPub
- .html: HTML File
- .md: Markdown
- .odt: Open Document Text
- .pdf: Portable Document Format (PDF)
- .ppt/.pptx: PowerPoint Document
- .txt: Text file (UTF-8)
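To see ahead of time which files in a folder can be uploaded, the list above can be turned into a small lookup. This is an illustrative sketch only: the names `SUPPORTED_EXTENSIONS`, `is_supported` and `split_by_support` are hypothetical and do not come from the project, and matching extensions case-insensitively is an assumption, since the actual loader may be stricter.

```python
from pathlib import Path

# Extensions copied from the list above; the descriptions are for display only.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".doc": "Word Document",
    ".docx": "Word Document",
    ".enex": "EverNote",
    ".eml": "Email",
    ".epub": "EPub",
    ".html": "HTML File",
    ".md": "Markdown",
    ".odt": "Open Document Text",
    ".pdf": "Portable Document Format (PDF)",
    ".ppt": "PowerPoint Document",
    ".pptx": "PowerPoint Document",
    ".txt": "Text file (UTF-8)",
}


def is_supported(filename):
    """Return True if the file's final extension is in the supported list.

    Assumption: the comparison ignores case, so 'report.PDF' counts as a PDF.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split filenames into (supported, unsupported), preserving input order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["notes.md", "slides.pptx", "photo.png"])
    print("Can be ingested:", ok)
    print("Not supported:", skipped)
```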