Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally called the tokenizer to convert the selections back to strings at their...
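The selection step can be sketched roughly as follows. The post does not say which submodular objective was used, so this sketch assumes a facility-location objective (each token is "covered" by its most similar selected token) maximized with the standard greedy algorithm. The function name `greedy_facility_location` is hypothetical, and the token embeddings are assumed to have been extracted already (for example from a multi-vector model) and passed in as plain lists of floats.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def greedy_facility_location(embeddings, k):
    """Pick up to k token indices that greedily maximize coverage.

    Assumed objective (not confirmed by the post):
        f(S) = sum_i max_{j in S} sim(i, j)
    Negative similarities are clamped to 0 so that f is monotone
    with f(empty set) = 0.
    """
    n = len(embeddings)
    sim = [
        [max(0.0, cosine(embeddings[i], embeddings[j])) for j in range(n)]
        for i in range(n)
    ]
    best = [0.0] * n  # best[i]: similarity of token i to its closest pick
    selected = []
    for _ in range(min(k, n)):
        top_gain, top_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much adding j improves every token's coverage.
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > top_gain:
                top_gain, top_j = gain, j
        selected.append(top_j)
        best = [max(best[i], sim[i][top_j]) for i in range(n)]
    # Return indices in original order so the kept tokens read left to right.
    return sorted(selected)
```

Calling `greedy_facility_location(token_embeddings, k)` returns the positions of the kept tokens, which can then be mapped back to token strings. Because the objective is monotone submodular, the greedy choice carries the usual (1 - 1/e) approximation guarantee.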
13,161 views • 11 months ago • via X (Twitter)
3 Comments

Try it on Google Colab's L4 GPU for free: This could be an interesting approach for extracting information from long documents, saving tokens for LLMs, etc. Check out our recent blog posts and learn more about submodular optimization.

Can you scale to replace Google? Put your AI on it and put in the numbers. Back of the envelope or "in a spreadsheet" is better than "in your head somewhere as an idea only". It might be easier than you think now, if the whole Internet is coded as it goes in, not scraped, indexed, and tokenized later, completely separated from the authors and without their permission or help. Check my writing on "global open tokens", where all tokens are linked to real things in the world, not arbitrary strings of characters in one language. Universal (global) tokens mean "the sun", "the earth", "water"; they are independent of human language, so they tie things together. Yes, choose the things that matter; keep it lean, sufficient, and sustainable, not shotgun or brute force that works only for people with big computers. For all humans, not just a few. Richard Collins, The Internet Foundation

Interesting how "Late chunking" is condensed into "lateing" (tokens "late" + "##ing"). As I understand it, this means the semantics of "chunking" (tokens "chunk" + "##ing") are mainly carried by the contextualized embedding of the "##ing" token.
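To illustrate how a selection like "late" + "##ing" ends up as "lateing", here is a minimal sketch of joining kept subword tokens back into a string. It assumes the "##" continuation prefix quoted in the comment above; the helper name `merge_wordpieces` is hypothetical, and the tokenizer actually used in the experiment may mark continuations differently.

```python
def merge_wordpieces(tokens):
    """Join subword tokens, gluing '##' continuation pieces to the previous word."""
    words = []
    for tok in tokens:
        if tok.startswith("##") and words:
            # Continuation piece: append to the previous kept token,
            # even if the pieces that originally sat between them were dropped.
            words[-1] += tok[2:]
        else:
            # Word start (or an orphaned continuation with nothing before it).
            words.append(tok[2:] if tok.startswith("##") else tok)
    return " ".join(words)


# If selection keeps "late" and "##ing" but drops "chunk",
# the surviving pieces fuse into a single word.
print(merge_wordpieces(["late", "##ing"]))
```

The fused output shows why a dropped middle piece can go unnoticed in the text: the continuation token attaches to whatever kept token precedes it.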
