Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: we first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally called the tokenizer to convert the selections back to the strings at their...
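The selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the post does not say which submodular objective was used, so the sketch assumes a facility-location objective (a common choice for coverage) maximized greedily. The names `similarity_matrix`, `coverage`, and `greedy_facility_location` are hypothetical, and the tiny 2-D vectors stand in for real token-level embeddings from jina-embeddings-v4.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(embeddings):
    """Pairwise similarities, clamped at 0 so the objective stays monotone."""
    return [[max(0.0, cosine(u, v)) for v in embeddings] for u in embeddings]


def coverage(sim, selected):
    """Facility-location value: each item is scored by its closest selected item."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_facility_location(sim, k):
    """Greedily pick up to k indices, each time adding the largest coverage gain."""
    n = len(sim)
    covered = [0.0] * n  # best similarity of each item to the selected set so far
    selected = []
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            gain = sum(max(sim[i][j] - covered[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        covered = [max(covered[i], sim[i][best_j]) for i in range(n)]
    return sorted(selected)  # sorted so the tokens keep their original order


# Toy stand-ins for token embeddings (assumption: real ones come from the model).
tokens = ["late", "chunking", "long", "context"]
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

sim = similarity_matrix(toy_embeddings)
picked = greedy_facility_location(sim, k=2)
print([tokens[i] for i in picked])
```

With two clusters of near-duplicate vectors, the greedy step picks one representative from each cluster rather than two tokens from the same one, which is the "best coverage" behavior the post describes. For a monotone submodular objective like this, greedy selection comes with the well-known (1 − 1/e) approximation guarantee.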

13,173 views • 11 months ago • via X (Twitter)

3 comments

Jina AI • 11 months ago

Try it on Google Colab's L4 GPU for free: This could be an interesting approach for extracting information from long documents, saving tokens for LLMs, etc. Check out our recent blog posts and learn more about submodular optimization.

Richard Collins, The Internet Foundation • 11 months ago

Can you scale to replace Google? Put your AI on it and put in the numbers. Back of the envelope or "in a spreadsheet" is better than "in your head somewhere as an idea only". It might be easier than you think now, if the whole Internet is coded as it goes in, rather than scraped, indexed, and tokenized later, completely separated from the authors and without their permission or help. Check my writing on "global open tokens", where all tokens are linked to the real things in the world, not arbitrary strings of characters in one language. Universal (global) tokens such as "the sun", "the earth", and "water" are independent of human language, which ties things together. Yes, choose the things that matter; keep it lean, sufficient, and sustainable, not shotgun or brute force that works only for people with big computers. For all humans, not just a few. Richard Collins, The Internet Foundation

Franck Lebeau • 11 months ago

Interesting how "late chunking" is condensed into "lateing" (tokens "late" + "##ing"). As I understand it, this means that the semantics of "chunking" (tokens "chunk" + "##ing") are mainly carried by the contextualized embedding of the "##ing".
