
I just built an open NotebookLM clone! Here's what it can do for you:
- Process multi-modal data
- Scrape websites and YouTube videos
- Create a unified knowledge base
- Lets you do RAG over it
- Remember every conversation
- Generate a podcast 🎙️

The idea here...

83,187 views • 8 months ago • via X (Twitter)


Related videos

Here is how you can install an open-source, enterprise-grade RAG system on your server (with the best document understanding I've seen.)

First, something obvious to anyone trying to sell RAG in the market: You are crazy if you think companies will let their data travel to a hosted model. No one wants to send their data anywhere (those who do haven't found an alternative.) Every single company would rather have an air-gapped system with no internet access.

GroundX is an open-source RAG system that you can run on your servers (or any cloud provider, as long as you have access to GPUs) and works without a network. (If the military wants to do RAG, this is precisely what they will be looking for.)

I installed GroundX on my AWS account and recorded a video to show you how to use it. There are two services you can use:

1. Ingest: This service uses a pretrained vision model to ingest and understand your knowledge base.
2. Search: This service combines text and vector search with a fine-tuned re-ranker model to retrieve information from your knowledge base.

A quick note about the Ingest service: 99% of people think they need better "retrieval" mechanisms. I think they need better "ingestion." That's where this service comes in! Ingest "understands" your documents in a way I haven't seen before. After you try it, you'll realize why showing your LLM your raw documents is a bad idea.

In the video, I use a free tool called X-Ray to test a document and understand how the Ingest service breaks it down. You can access this tool by signing up for a free GroundX cloud account and uploading your documents. You'll see a bit more about this in the video.

Santiago

89,624 views • 1 year ago
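The GroundX post above describes a Search service that combines text search and vector search before re-ranking. As a rough illustration of how two such rankings can be merged, here is a minimal sketch that uses reciprocal rank fusion, a common fusion technique chosen here for illustration only. It is not GroundX's implementation or API: every function name, the toy keyword scorer, and the two-dimensional vectors are assumptions, and the fine-tuned re-ranker step is not modeled.

```python
import math
from collections import Counter


def keyword_scores(query, docs):
    """Toy text search: count occurrences of query terms in each document."""
    terms = query.lower().split()
    scores = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        scores.append(sum(counts[t] for t in terms))
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ranking(scores):
    """Return document indices ordered from best score to worst."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings: each list contributes 1 / (k + rank) per document."""
    fused = {}
    for one_ranking in rankings:
        for rank, doc_id in enumerate(one_ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


def hybrid_search(query, query_vec, docs, doc_vecs, top_k=2):
    """Hypothetical helper: fuse a keyword ranking with a vector ranking."""
    text_rank = ranking(keyword_scores(query, docs))
    vector_rank = ranking([cosine(query_vec, v) for v in doc_vecs])
    return reciprocal_rank_fusion([text_rank, vector_rank])[:top_k]


# Made-up documents and hand-written stand-in embeddings.
docs = [
    "invoice total due in march",
    "holiday schedule for staff",
    "march invoice payment terms",
]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

top_docs = hybrid_search("march invoice", [1.0, 0.0], docs, doc_vecs)
print([docs[i] for i in top_docs])
```

In a fuller pipeline, the fused top results would then be passed to a re-ranker model for a final ordering.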

Be Smart as Andrej Karpathy with Teamily AI

🧠 Your Personal Knowledge Base:
✅ Built in One Chat.
📈 Compounded via Conversations.

Karpathy's insight is spot on. It attracted 10 million views in a few days. The idea is simple: AI should build personal knowledge from everything you feed it, so it stops rediscovering things from scratch the way a Retrieval-Augmented Generation (RAG) system does.

But here's the reality: most people aren't Stanford PhD-level geeks like Karpathy. For the rest of us, operating a hacky collection of scripts and tools (Obsidian Web Clipper, Marp, Dataview, etc.) as seen in Karpathy's idea file is far too complex. The Internet needs an intuitive product where a personal knowledge base is a persistent, compounding artifact, one that grows alongside the content you consume, the contexts you inhabit, and the questions you ask.

Teamily AI is the answer. The conversation IS the knowledge base. It's an AI-native messenger where AI teammates join your chats. They remember your past discussions, your preferences, and your team's context, getting smarter the more you talk. No setup. No complicated workflows. Just text as you normally do.

Whether you're saving articles and videos, brainstorming at work, or collaborating with colleagues, your AI teammates are right there. They listen, remember, and help, not from scratch every time, but by building a personal knowledge graph of everything you're involved in. In essence, your knowledge compounds automatically.

✨ The user experience is effortless. Whenever you need a well-organized view of your data, just ask the "Personal AI" at the top of the Teamily window: "Visualize my personal knowledge base". Want to customize the style or indexes? Just chat with it. You define how you manage your knowledge.

Our co-founder Aiden has prepared a short video to show you just how easy it is. 📽️

Teamily AI

15,400 views • 2 months ago

New short course Multimodal RAG: Chat with Videos, developed with Intel and taught by vasudevlal!

In this course, you'll work with LLaVA (Large Language and Vision Assistant), a Large Vision Language Model (LVLM) that can process both images and text. For example, given an image of a person doing a handstand on a skateboard at the beach, LLaVA doesn't just caption the scene, it's able to predict possible outcomes, like the person losing balance or falling off. By understanding not just what's in a video frame, but what might happen next, your application can provide more insightful answers to questions about video.

You'll build a full multimodal RAG pipeline that can chat about video content:
- Use the BridgeTower model to create joint text-image embeddings in a 512-dimensional multimodal semantic space.
- Learn video processing techniques to extract keyframes, generate transcripts using Whisper, and create captions.
- Use the LanceDB vector database to store and retrieve high-dimensional multimodal embeddings.
- Integrate the LLaVA model, combining CLIP's (Contrastive Language Image Pretraining) vision transformer with Llama, for advanced visual-textual reasoning.

Your final system will ingest video data, generate embeddings for frames and text, perform similarity searches for relevant content, and use the retrieved multimodal context to inform LVLM-based response generation. The result is a system capable of answering nuanced questions about video content, effectively chatting about the video it has processed.

Please sign up here!

Andrew Ng

107,548 views • 1 year ago
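The retrieval step in the Multimodal RAG course description above (store embeddings for video segments, run a similarity search, then hand the retrieved context to the model) can be sketched in a few lines. This is only an illustration under loud assumptions: the embeddings are seeded random vectors standing in for a real joint text-image model, a plain Python list stands in for a vector database, and `Segment`, `fake_embedding`, `search`, and `build_prompt` are hypothetical names, not part of BridgeTower, LanceDB, or LLaVA.

```python
import math
import random
from dataclasses import dataclass

EMBED_DIM = 512  # dimensionality quoted in the course description


@dataclass
class Segment:
    """One video segment: a keyframe timestamp, its transcript, and an embedding."""
    video: str
    start_sec: float
    transcript: str
    embedding: list


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def fake_embedding(seed):
    """Stand-in for a real embedding model: a seeded random unit vector."""
    rng = random.Random(seed)
    return normalize([rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)])


def search(store, query_vec, top_k=2):
    """Return the top_k segments ranked by cosine similarity to query_vec."""
    scored = [
        (sum(q * e for q, e in zip(query_vec, seg.embedding)), seg)
        for seg in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:top_k]]


def build_prompt(question, segments):
    """Assemble retrieved transcript snippets into a text prompt for a model."""
    context = "\n".join(
        f"[{s.video} @ {s.start_sec:.0f}s] {s.transcript}" for s in segments
    )
    return f"Context:\n{context}\n\nQuestion: {question}"


# Made-up segments; a list plays the role of the vector store.
store = [
    Segment("demo.mp4", 0.0, "A skater approaches the ramp.", fake_embedding(1)),
    Segment("demo.mp4", 12.0, "The skater attempts a handstand.", fake_embedding(2)),
    Segment("demo.mp4", 25.0, "The crowd applauds.", fake_embedding(3)),
]

# Reuse a stored embedding as the query so the example is deterministic;
# a real system would embed the user's question instead.
hits = search(store, store[1].embedding, top_k=1)
print(build_prompt("What trick does the skater try?", hits))
```

A real system would swap the list for a vector database and pass the retrieved frames, not just transcripts, to the vision-language model.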