
Jina AI
@JinaAI_ • 17,192 subscribers
Your Search Foundation, Supercharged!

jina-embeddings-v5-omni is here! Our first universal embedding model for text, images, audio, and video. Available in two sizes: small (1.57B, 1024-dim, 32K context) and nano (0.95B, 768-dim, 8K context). Both support Matryoshka truncation down to 32 dimensions. v5-omni is backward-compatible: if you already use jina-embeddings-v5-text-small/nano, your existing text indexes work with v5-omni out of the box. There is no need to reindex your text; just index your multimodal content with v5-omni and start searching images, audio, and video.
Jina AI • 130,989 views • 24 days ago
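
For readers unfamiliar with Matryoshka truncation, here is a minimal sketch of the usual recipe: keep the leading components of a vector and re-normalize. The helper name `truncate_embedding` and the toy 8-dimensional vector are assumptions for illustration, not real model output or Jina's API.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Hypothetical helper: this is the common Matryoshka recipe,
    not code taken from any Jina library.
    """
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to normalize
    return [x / norm for x in head]

# Toy 8-dim vector standing in for a real 1024- or 768-dim embedding.
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]
short = truncate_embedding(full, 4)
```

Truncated vectors should only be compared with other vectors truncated to the same length.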

curl This is our Meta-Prompt. It allows LLMs to understand our Reader, Embeddings, Reranker, and Classifier APIs for improved codegen. Using the meta-prompt is straightforward: copy the prompt into your preferred LLM interface, such as ChatGPT or Claude, add your instructions, and you're set. In this example, we copied the entire prompt into Anthropic Claude and asked it to grab every sentence from the Hacker News front page and visualize them using UMAP with matplotlib. This task is nontrivial because it combines multiple APIs from our Search Foundation, such as Reader and Embeddings, which Claude may not know about. If you asked Claude directly, it probably wouldn't give an optimal answer. With the meta-prompt, Claude has good knowledge of our APIs and can generate much better code! We can copy and paste the code directly into Google Colab, and with minimal modification, it just works!
Jina AI • 209,001 views • 1 year ago

Introducing jina-deepsearch-v1: it will search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, search, read, reason, ... 🔄 until the best answer is found.
Jina AI • 83,892 views • 1 year ago
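
The search, read, reason loop described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not jina-deepsearch-v1's implementation: `deep_search`, the three injected callables, and the `max_steps` budget are all hypothetical names introduced here.

```python
from typing import Callable, List, Optional

def deep_search(
    question: str,
    search: Callable[[str], List[str]],
    read: Callable[[str], str],
    reason: Callable[[str, List[str]], Optional[str]],
    max_steps: int = 5,  # assumed budget so the loop always terminates
) -> Optional[str]:
    """Repeat search -> read -> reason until `reason` returns an answer."""
    notes: List[str] = []
    for _ in range(max_steps):
        for url in search(question):      # search: find candidate sources
            notes.append(read(url))       # read: collect their content
        answer = reason(question, notes)  # reason: answer, or None to keep going
        if answer is not None:
            return answer
    return None

# Toy stand-ins for a real search engine, page reader, and LLM.
pages = {"a": "Paris is the capital of France."}

def toy_search(query):
    return ["a"]

def toy_read(url):
    return pages[url]

def toy_reason(question, notes):
    # Pretend the reasoner wants two rounds of evidence before answering.
    return "Paris" if len(notes) >= 2 else None

result = deep_search("capital of France?", toy_search, toy_read, toy_reason)
```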

This idea is either extremely smart or extremely stupid, with no in-between. What if your LLM *is* your search engine? What would you look like inside it? Forget about Perplexity and DeepResearch. What if an LLM is your entire Google? Pagination, links, and everything, just like the old days: no chat UI, classic Google vibe. If you're unsure what that means, watch the demo video below first. We call it LLM-as-SERP (Search Engine Results Page).
Jina AI • 73,942 views • 1 year ago

One interesting question people ask us is: "How do you guys vibe-check your embeddings?" Sure, there's MTEB for more serious quantitative evaluation on public benchmarks, but what do you do for open-domain or new problems? Today we want to share a small internal tool we use for debugging and visualization. You can call it vibe-testing; we call it "Correlations", and it's now open source on GitHub.
Jina AI • 33,915 views • 1 year ago
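
One simple way to vibe-check embeddings is to inspect a pairwise cosine-similarity matrix between two sets of vectors. The sketch below assumes that this is the kind of view being debugged; `cosine_matrix` and the toy 2-dimensional vectors are hypothetical and are not taken from the Correlations tool.

```python
import math

def cosine_matrix(rows, cols):
    """Return matrix[i][j] = cosine similarity between rows[i] and cols[j]."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v] if n else list(v)

    unit_rows = [unit(v) for v in rows]
    unit_cols = [unit(v) for v in cols]
    return [
        [sum(a * b for a, b in zip(u, w)) for w in unit_cols]
        for u in unit_rows
    ]

# Toy vectors standing in for real query and document embeddings.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs = [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
matrix = cosine_matrix(queries, docs)
```

Each row can then be rendered as a heatmap row, so unexpectedly bright or dark cells stand out.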

Most don't know (1) how easy it is to invert embedding vectors back into sentences, and (2) that this is a perfect task for text diffusion models. Here's a 78M parameter model and live demo that recovers 80% of tokens from Qwen3-Embedding and EmbeddingGemma vectors. It works even on multilingual input.
Jina AI • 12,742 views • 3 months ago

Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: we first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally called the tokenizer to convert the selections back to strings at their original positions. Think of it as a form of "compression": you can adjust the top-k slider to dial in different compression rates. Can you still make sense of the compressed text?
Jina AI • 13,161 views • 11 months ago
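
As a rough illustration of coverage-driven selection, the sketch below greedily maximizes a facility-location objective, one common submodular function; the post does not say which objective was actually used, so treat that choice as an assumption. The function names and the toy 2-dimensional "token embeddings" are hypothetical.

```python
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def greedy_facility_location(embeddings, k):
    """Pick up to k indices greedily maximizing sum_i max_{j in S} sim(i, j).

    Returns the indices sorted, so selections keep their original order.
    """
    n = len(embeddings)
    # Clip negative similarities so the objective stays monotone.
    sim = [[max(0.0, cosine(embeddings[i], embeddings[j])) for j in range(n)]
           for i in range(n)]
    covered = [0.0] * n  # best similarity of each item to the current selection
    selected = []
    for _ in range(min(k, n)):
        best_j, best_gain = None, -1.0
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much j improves coverage of every item.
            gain = sum(max(sim[i][j] - covered[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_j, best_gain = j, gain
        selected.append(best_j)
        covered = [max(covered[i], sim[i][best_j]) for i in range(n)]
    return sorted(selected)

# Toy data: two loose clusters of "tokens" (not real model embeddings).
tokens = ["cats", "kittens", "purr", "stocks", "bonds"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
picks = greedy_facility_location(vectors, k=2)
compressed = " ".join(tokens[i] for i in picks)
```

With `k=2`, the selection contains one token from each cluster, which is the coverage behavior the post describes; raising `k` plays the role of the top-k slider.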