Large Language Models (LLM) Explained Briefly: the best visual explanation of LLMs that I have seen; source in the comment below.

Mohit Mishra

35,129 subscribers

15,704 views • 1 year ago • via X (Twitter)

Education, Gaming, Science & Technology

3 comments

Mohit Mishra • 1 year ago

Source:

Tim Pratt • 1 year ago

Great explainer! 🤓 Expanding on this: Beyond understanding parameters/weights, tools like adjusting temperature, prompt engineering, RAG, flow control, & fine-tuning can massively enhance results. Excited for future models with even larger parameters—better context = next-level insights!

Mohit Mishra • 1 year ago

yeah

Related videos

This is the best Visual Explanation of how LLMs actually work

sachin.

345,928 views • 4 months ago

The Qwen model family cyber atlas: a visual walkthrough of how AI really evolved — from large language models to embodied intelligence. It all begins with large language models. Watch the full video below.

Tongyi Lab

2,548,947 views • 5 days ago

EXPLAINED: What is an LLM? 🤔 Associate Prof Jakob Foerster shares everything you need to know about LLM (large language model) in 90 seconds. #OxfordAI

University of Oxford

97,260 views • 2 years ago

Why Is the C Compiler So Smart? A great explanation of the beauty of compilers; I have attached the source and article link in the comment below.

Mohit Mishra

49,702 views • 1 year ago

TNT X-Space Series: Minister of Ministry of ICT and Innovation | Rwanda, Paula Ingabire, discusses the development of Kinyarwanda Large Language Models (LLMs).

The New Times (Rwanda)

32,980 views • 1 year ago

Andrej Karpathy calls large language models the new computing paradigm: CPU -> LLM, bytes -> tokens, RAM -> context window. This is the large language model OS (LMOS).

ℏεsam

343,327 views • 1 year ago

Learning to Decode Collaboratively with Multiple Language Models We propose a method to teach multiple large language models (LLM) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent

AK

52,644 views • 2 years ago

Today we introduce T-Free, a new paradigm in language processing. Tokenization is one of the core building blocks of large language models (LLMs), transforming natural language into numeric representations for further processing. (1/3) 🔗 #writtenbyalephalpha

Aleph Alpha

18,121 views • 1 year ago

Geoffrey Hinton says AI models understand in the same way that people do and the best model we have of how the human brain works is large language models

Tsarathustra

59,171 views • 1 year ago

Debate/ Oracle-Aggregated LLM Outputs Are a Myth - Chi Zhang from KITE AI - nalin 🇺🇸 from Google - @chiragdhull from Chainlink Labs As the use of Large Language Models (LLMs) accelerates, the question arises: can oracles aggregate and validate outputs from these AI systems to achieve reliable, decentralized insights? Full video below 👇🧵

ETHDenver 🏔🦬🦄

17,478 views • 1 year ago

Yann LeCun argues that large language models (LLMs) cannot reach human-level or superintelligence just by scaling. He says the current LLM paradigm is hitting its limits. Many researchers are now exploring “agentic systems,” but building them on top of LLMs alone is flawed. LLMs can't plan actions well because they don’t truly understand or predict consequences. To get intelligent behavior, we need something fundamentally different.

Wes Roth

71,832 views • 5 months ago

Met a guy making $1.6 million/year as an LLM engineer. I asked him how he learned LLMs from scratch. He sent me the exact video that got him in. A 1 hour course on how LLMs actually work. He shows how transformers inside LLMs like ChatGPT & Claude are actually built. I watched it last night. Halfway through, I realized LLM architecture is way simpler than they make it look. Bookmark this and read the article below.
• 00:00 - LLM foundations
• 04:21 - LLM tokenization
• 05:43 - LLMs vector embeddings
• 22:16 - attention mechanism of LLM
• 43:42 - LLM multi head attention

Roan

73,925 views • 2 days ago

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models paper page: github: Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse images. However, these models still encounter difficulties when generating images from prompts that demand spatial or common sense reasoning. We propose to equip diffusion models with enhanced reasoning capabilities by using off-the-shelf pretrained large language models (LLMs) in a novel two-stage generation process. First, we adapt an LLM to be a text-guided layout generator through in-context learning. When provided with an image prompt, an LLM outputs a scene layout in the form of bounding boxes along with corresponding individual descriptions. Second, we steer a diffusion model with a novel controller to generate images conditioned on the layout. Both stages utilize frozen pretrained models without any LLM or diffusion model parameter optimization. We validate the superiority of our design by demonstrating its ability to outperform the base diffusion model in accurately generating images according to prompts that necessitate both language and spatial reasoning. Additionally, our method naturally allows dialog-based scene specification and is able to handle prompts in a language that is not well-supported by the underlying diffusion model.

AK

83,657 views • 3 years ago

.SnowflakeDB is thrilled to announce #SnowflakeArctic: A state-of-the-art large language model uniquely designed to be the most open, enterprise-grade LLM on the market. This is a big step forward for open source LLMs. And it’s a big moment for Snowflake in our #AI journey as we continue to build best-in-class enterprise-grade products for our customers. The era of enterprise AI is here. 🚀

sridhar

243,719 views • 2 years ago

3D-LLM: Injecting the 3D World into Large Language Models paper page: Large language models (LLMs) and Vision-Language Models (VLMs) have been proven to excel at multiple tasks, such as commonsense reasoning. Powerful as these models can be, they are not grounded in the 3D physical world, which involves richer concepts such as spatial relationships, affordances, physics, layout, and so on. In this work, we propose to inject the 3D world into large language models and introduce a whole new family of 3D-LLMs. Specifically, 3D-LLMs can take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks, including captioning, dense captioning, 3D question answering, task decomposition, 3D grounding, 3D-assisted dialog, navigation, and so on. Using three types of prompting mechanisms that we design, we are able to collect over 300k 3D-language data covering these tasks. To efficiently train 3D-LLMs, we first utilize a 3D feature extractor that obtains 3D features from rendered multi-view images. Then, we use 2D VLMs as our backbones to train our 3D-LLMs. By introducing a 3D localization mechanism, 3D-LLMs can better capture 3D spatial information. Experiments on ScanQA show that our model outperforms state-of-the-art baselines by a large margin (e.g., the BLEU-1 score surpasses state-of-the-art score by 9%). Furthermore, experiments on our held-in datasets for 3D captioning, task composition, and 3D-assisted dialogue show that our model outperforms 2D VLMs. Qualitative examples also show that our model could perform more tasks beyond the scope of existing LLMs and VLMs.

AK

249,708 views • 3 years ago

Just dropped a 4 hour lecture on "Large Language Models": 0:00 Basics of language models 2:30 Word2vec 16:27 Transfer Learning 19:23 BERT 1:00:39 T5 1:31:14 GPT1-3 1:53:05 ChatGPT 2:20:03 LLMs as Deep RL 2:53:00 Policy Gradient 3:32:50 Train your own LLM

Soheil Feizi

217,255 views • 2 years ago

Meet #DBRX: a general-purpose LLM that sets a new standard for efficient open source models. Use the DBRX model in your RAG apps or use the DBRX design to build your own custom LLMs and improve the quality of your GenAI applications.

Databricks

327,781 views • 2 years ago

The clearest and crispest explanation I've ever heard of how large language models compress and capture a "world-model" in their weights simply by learning to predict the next word accurately. Furthermore, how the raw power of these base models can then be tamed by teaching them to follow instructions from humans. Source:

Zain

970,987 views • 2 years ago