I'm excited to announce Semantra: an open source multi-tool for semantic search 🎉

- Launch a local search engine over text and PDF files
- Search by concepts/meaning
- Refine results via tagging and adding/subtracting queries

Try it out now 🚀📚🔍
130,360 views • 3 years ago • via X (Twitter)
10 Comments

Semantra is built for those seeking needles in haystacks: journalists, researchers, students, and more. I've found it useful personally across a wide range of content, including books, reports, speeches, and government documents. Tutorial:

Semantra runs locally, keeping your data safe, or it can optionally use OpenAI's paid embedding models to offload computation. Install with Python/pipx:

```
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```

In a new terminal, run:

```
pipx install semantra
```

To run Semantra over a collection of documents (text or PDF):

```
semantra <filenames>
```

It will download embedding models as needed, analyze the documents in chunks, and launch a local web app for interactive analysis ✨
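For a rough feel of what "analyze the documents in chunks" can mean, here's a minimal sketch of splitting text into overlapping word windows before embedding. This is an illustration only, not Semantra's actual code: the function name `chunk_words` and the `window`/`overlap` defaults are assumptions made up for this example.

```python
def chunk_words(text, window=128, overlap=16):
    """Split text into overlapping chunks of at most `window` words.

    Hypothetical helper for illustration; not part of Semantra.
    """
    if overlap < 0 or window <= overlap:
        raise ValueError("window must be larger than a non-negative overlap")

    words = text.split()
    step = window - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        # Stop once a chunk has reached the end of the text.
        if start + window >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g", window=4, overlap=1):
        print(chunk)
```

Each chunk would then be embedded separately, so a search can point to the passage that matches rather than just the file. The overlap keeps sentences that straddle a boundary from being cut off in both neighbors.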

Here's an example using Semantra on a collection of US inaugural speeches. You can play with this document collection in the tutorial. After downloading the documents, analyze them all at once with:

```
semantra us_inaugural_speeches/*.txt
```

Semantra is full of flexible options: you can run any Hugging Face transformers model, change the window sizes for the embeddings, switch up the results algorithm, and more. Processed documents are cached by content so Semantra only ever does the initial processing work once.
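As a sketch of the "cached by content" idea: key the cache on a hash of the file's bytes (plus the model name), so renaming or moving a file doesn't trigger reprocessing, while editing it does. This is a generic pattern under assumed names (`content_key`, `cached_process`, the JSON cache layout), not a description of how Semantra stores its cache.

```python
import hashlib
import json
from pathlib import Path


def content_key(data: bytes, model_name: str) -> str:
    """Derive a cache key from the file contents and the model name."""
    h = hashlib.sha256()
    h.update(model_name.encode("utf-8"))
    h.update(b"\0")  # separator so model name and data can't blur together
    h.update(data)
    return h.hexdigest()


def cached_process(path, model_name, process, cache_dir):
    """Run `process(data)` once per unique (content, model) pair.

    `process` is any caller-supplied function returning JSON-serializable
    output; it stands in for the expensive embedding step (hypothetical).
    """
    data = Path(path).read_bytes()
    cache_file = Path(cache_dir) / f"{content_key(data, model_name)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    result = process(data)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result
```

Including the model name in the key means switching models recomputes embeddings instead of reusing stale ones.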

I wrote documentation for Semantra in hopes it will be serviceable. Please let me know if you have any feedback, encounter any issues, or have any suggestions/ideas! Repo: Tutorial: Guides:

Phenomenal, will be checking this out today.

This is amazing! Installing on my machine asap. I can't wait to see if I can go through textbooks faster.

Awesome! Let me know how it goes

I was hoping to see something like this a long time ago, really happy to have stumbled upon your tweet today 😍😍😍


