Jina AI

@JinaAI_ • 17,299 subscribers

Your Search Foundation, Supercharged! (acquired by @elastic Oct. 2025)

Shorts

One cool thing about ColBERT-based search compared to cosine-based vector retrieval is that you get interpretability for free as a byproduct of the MaxSim computation. It's kind of like the Lucene highlighter: it lets you grab the most relevant snippets from a long document to show users where their query matches. With Jina-ColBERT-v1, which supports up to 8K tokens and which we released earlier this February, the visualization of the late interaction between a query and a document is almost... artistic. The video shows the late interaction between the query "Elephants eat 150 kg of food per day." and the Wikipedia article "Indian Elephant". Darker colors indicate stronger semantic matches. The darkest area corresponds to "The species is classified as a megaherbivore and consume up to 150 kg (330 lb) of plant matter per day." from the original article.
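For readers who want to see why MaxSim gives highlighting for free, here is a minimal sketch in plain Python. It is not the Jina-ColBERT-v1 code: the function names, the toy 2-dimensional token vectors, and the choice to color a document token by the strongest query match it wins are all assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim(query_vecs, doc_vecs):
    """Late-interaction score plus a per-document-token heat list.

    For each query token, take its best-matching document token and add
    that similarity to the score. heat[j] records the strongest match that
    document token j won; it stays 0.0 if the token never won a match.
    """
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    score = 0.0
    heat = [0.0] * len(d)
    for qv in q:
        sims = [sum(a * b for a, b in zip(qv, dv)) for dv in d]
        best = max(range(len(d)), key=sims.__getitem__)
        score += sims[best]
        heat[best] = max(heat[best], sims[best])
    return score, heat


# Toy token embeddings (hypothetical values, 2 query tokens, 3 document tokens).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
score, heat = maxsim(query_vecs, doc_vecs)
```

In this toy case the first and third document tokens win one query token each, so they would be drawn dark, while the middle token stays uncolored. A real heatmap would run the same loop over model-produced token embeddings.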

22,268 views

Videos

jina-embeddings-v5-omni is here! Our first universal embedding model for text, images, audio, and video. Available in two sizes: small (1.57B, 1024-dim, 32K context) and nano (0.95B, 768-dim, 8K context). Both support Matryoshka truncation down to 32 dimensions. v5-omni is backward-compatible: if you already use jina-embeddings-v5-text-small/nano, your existing text indexes work with v5-omni out of the box. There is no need to reindex the text; just index your multimodal content with v5-omni and start searching images, audio, and video.
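Matryoshka truncation is commonly applied by keeping the leading dimensions of an embedding and re-normalizing the result. The sketch below shows that general recipe in plain Python. It does not call any Jina API, and the function name `truncate_embedding` and the toy vector are hypothetical.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


# Toy 3-dimensional "embedding" cut down to 2 dimensions.
short = truncate_embedding([3.0, 4.0, 12.0], 2)
```

Queries and documents need to be truncated to the same dimension before their similarities are compared.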

134,015 views • 2 months ago

curl This is our Meta-Prompt. It allows LLMs to understand our Reader, Embeddings, Reranker, and Classifier APIs for improved codegen. Using the meta-prompt is straightforward. Just copy the prompt into your preferred LLM interface like ChatGPT, Claude, or whatever works for you, add your instructions, and you're set. In this example, we copied the entire prompt into Anthropic Claude and asked it to grab every sentence from the Hacker News front page and visualize them using UMAP with matplotlib. This task is nontrivial because it combines multiple APIs from our Search Foundation, such as Reader and Embeddings, which Claude may not know about. So if you asked Claude directly, it probably wouldn't give an optimal answer. But with the meta-prompt, Claude now has good knowledge of our APIs and can generate much better code! We can copy and paste the code directly into Google Colab, and with minimal modification, the code just works!

209,078 views • 1 year ago

Introducing jina-deepsearch-v1: it searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, searches, reads, reasons, ... 🔄 until the best answer is found.
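The search, read, reason cycle can be pictured as a simple loop with a step budget. The sketch below is only an illustration of that control flow, not how jina-deepsearch-v1 is implemented: `deep_search`, the three callables it takes, the `max_steps` limit, and the fake helpers are all hypothetical stand-ins.

```python
def deep_search(question, search, read, reason, max_steps=5):
    """Repeat search -> read -> reason until an answer appears or steps run out.

    search(query) returns a list of URLs, read(url) returns page text, and
    reason(question, notes) returns (answer, next_query), where answer is
    None while more evidence is needed.
    """
    notes = []
    query = question
    for step in range(1, max_steps + 1):
        for url in search(query):
            notes.append(read(url))
        answer, next_query = reason(question, notes)
        if answer is not None:
            return answer, step
        query = next_query
    return None, max_steps


# Fake components so the loop can run without any network or model calls.
def fake_search(query):
    return ["https://example.com/" + query.replace(" ", "-")]


def fake_read(url):
    return "text of " + url


def fake_reason(question, notes):
    if len(notes) >= 3:
        return f"answer from {len(notes)} notes", None
    return None, f"{question} follow-up {len(notes)}"


answer, steps = deep_search("elephant diet", fake_search, fake_read, fake_reason)
```

With these fake helpers the loop stops on the third pass, once three notes have been collected.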

83,901 views • 1 year ago

This idea is either extremely smart or extremely stupid, with no in-between. What if your LLM *is* your search engine? What would you look like inside it? Forget about Perplexity, DeepResearch. What if an LLM is your entire Google? Pagination, links and everything - just like the old days: no chat UI, classic Google vibe. If you're unsure what that means, watch the demo video below first. We call it LLM-as-SERP (Search Engine Results Page).

73,960 views • 1 year ago

One interesting question people ask us is: "How do you guys vibe-check your embeddings?" Sure, there's MTEB for more serious quantitative evaluation on public benchmarks, but what do you do for open-domain or new problems? Today we want to share a small internal tool we use for debugging and visualization. You can call it vibe-testing, we call it "Correlations" - and it's now open source on GitHub.

33,931 views • 1 year ago

Most people don't know (1) how easy it is to invert embedding vectors back into sentences, and (2) that this is a perfect task for text diffusion models. Here's a 78M parameter model and live demo that recovers 80% of tokens from Qwen3-Embedding and EmbeddingGemma vectors. It works even on multilingual input.

12,977 views • 5 months ago

Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: we first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally called the tokenizer to convert the selections back to strings at their original positions. Think of it as a form of "compression": you can adjust the top-k slider to dial in different "compression rates". Can you still make sense of the compressed text?
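The post does not say which submodular objective was used. As one common choice, the sketch below greedily maximizes a facility-location coverage function over a token-to-token similarity matrix, then returns the picks in their original order. The function name, the toy tokens, and the hand-written similarity values are assumptions for illustration; in practice the matrix would come from cosine similarities between token-level embeddings.

```python
def greedy_facility_location(sim, k):
    """Pick k indices that greedily maximize sum_i max_{j in S} sim[i][j].

    sim is a square matrix of non-negative similarities. The selected
    indices are returned sorted, i.e. in original token order.
    """
    n = len(sim)
    selected = []
    covered = [0.0] * n  # best similarity of each token to the selected set
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much coverage improves if token j is added.
            gain = sum(max(sim[i][j] - covered[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        covered = [max(covered[i], sim[i][best_j]) for i in range(n)]
    return sorted(selected)


# Toy example with hypothetical similarity values.
tokens = ["Elephants", "eat", "plants", "daily"]
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.4],
    [0.1, 0.1, 1.0, 0.2],
    [0.1, 0.4, 0.2, 1.0],
]
picked = greedy_facility_location(sim, 2)
compressed = " ".join(tokens[i] for i in picked)
```

Raising `k` plays the role of the top-k slider: more tokens are kept and the "compressed" text moves closer to the original passage.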

13,177 views • 1 year ago
