Transformer Explainer: Interactive Learning of Text-Generative Models

Transformers have revolutionized machine learning, yet their inner workings remain opaque to many. We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. Our tool helps users understand complex Transformer concepts by integrating a model overview and enabling smooth transitions across abstraction levels of mathematical operations and model structures. It runs a live GPT-2 instance locally in the user's browser, empowering users to experiment with their own input and observe in real time how the internal components and parameters of the Transformer work together to predict the next tokens. Our tool requires no installation or special hardware, broadening the public's education access to modern generative AI techniques.
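The last step the abstract mentions, predicting the next token, can be illustrated with a minimal sketch. It assumes the model has already produced one raw score (logit) per vocabulary entry; the four-word vocabulary and the logit values below are made up for illustration and are not GPT-2 output.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and logits (assumptions, not real model output).
vocab = ["cat", "dog", "car", "tree"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
# Greedy decoding: pick the most probable token.
next_token = vocab[probs.index(max(probs))]
print(next_token, [round(p, 3) for p in probs])
```

Greedy selection is only one option; sampling from `probs` instead would give more varied continuations.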

AK

510,751 subscribers

90,798 views • 1 year ago • via X (Twitter)

Science & Technology, Art, Education

5 comments

Duen Horng "Polo" Chau • 1 year ago

Thanks @_akhaliq for sharing Transformer Explainer! Hope everyone enjoys playing with the tool!

Duen Horng "Polo" Chau • 1 year ago

👏Congrats to Transformer Explainer co-leads @cho_aeree @gracekimcy @alexkarpekov ; and @Jay4w @SeongminLeee @alec_helbling @Ben_Hoov at @gtcomputing !

Michael Buloichyk • 1 year ago

That can be a huge addition to the existing @karpathy tutorial on building GPT-2 from scratch, I presume

Joseph David Huelbig • 1 year ago

I was hoping for the other kind of transformer.. @transformers

OSDev • 1 year ago

rst @readwise save thread

Related videos

Transformer Explainer Really cool interactive tool to learn about the inner workings of a Transformer model. Apparently, it runs a GPT-2 instance locally in the user's browser and allows you to experiment with your own inputs. This is a nice tool to learn more about the different components inside the Transformer and the transformations that occur. Tool:

elvis

121,921 views • 1 year ago

Announcing How Transformer LLMs Work, created with Jay Alammar and Maarten Grootendorst, co-authors of the beautifully illustrated book, “Hands-On Large Language Models.” This course offers a deep dive into the inner workings of the transformer architecture that powers large language models (LLMs).

The transformer architecture revolutionized generative AI; in fact, the "GPT" in ChatGPT stands for "Generative Pre-Trained Transformer." Originally introduced in the Google Brain team's groundbreaking 2017 paper "Attention Is All You Need," by Vaswani and others, transformers were a highly scalable model for machine translation tasks. Variants of this architecture now power today’s LLMs such as those from OpenAI, Google, Meta, Cohere, Anthropic and DeepSeek.

In this course, you’ll learn in detail how LLMs process text. You'll also work through code examples that illustrate the transformer's individual components. Specifically, you’ll learn:
- How the representation of language has evolved, from Bag-of-Words to Word2Vec embeddings to the transformer architecture that captures a word's meanings taking into account the context of other words in the input.
- How inputs are broken down into tokens before they are sent to the language model.
- The details of a transformer's main stages: tokenization and embedding, the stack of transformer blocks, and the language model head.
- The inner workings of the transformer block, including attention, which calculates relevance scores, and the feedforward layer, which incorporates stored information learned in training.
- How cached calculations make transformers faster.
- Some of the most recent ideas in the latest models, such as Mixture-of-Experts (MoE), which uses multiple sub-models and a router on each layer to improve the quality of LLMs.

By the end of this course, you’ll have a deep understanding of how LLMs actually process text and be able to read through papers describing the latest models and understand the details. Gaining this intuition will improve your approach to building LLM applications. Please sign up here:
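The bullets on attention ("relevance scores") and on cached calculations can be sketched together. This is a minimal illustration, not the course's code: it assumes scaled dot-product attention for a single query, and the `KVCache` class and the 2-dimensional vectors are hypothetical. In a real model the queries, keys, and values come from learned projections.

```python
import math

def softmax(xs):
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Relevance score of the query against every stored key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

class KVCache:
    """Hypothetical cache: keeps keys/values of earlier tokens so each
    generation step only adds the newest token's key and value."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, query, key, value):
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

cache = KVCache()
# Made-up (query, key, value) triples for two generation steps.
steps = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
]
outputs = [cache.step(q, k, v) for q, k, v in steps]
print(outputs)
```

Without the cache, every step would have to rebuild the keys and values for all earlier tokens; with it, each step appends one pair and reuses the rest.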

Andrew Ng

259,421 views • 1 year ago

We just released our interview with the father of Generative AI - Jürgen Schmidhuber! The G, P, and T in "ChatGPT" (GPT means "Generative Pre-Trained Transformer") go back to Juergen's work of 1990-91 when he published what's now called "Unnormalised Linear Transformers," "Self-Supervised Pre-Training" for deep learning with long texts, and "Generative Adversarial Networks" for Artificial Curiosity. Remarkably, principles of both Transformers and LSTMs date back to 1991, the only palindromic year of the 20th century! Transformers are easier to parallelise, but LSTMs can solve problems which are unsolvable by Transformers. In this first part of our two part show, we discuss the history and the future of the field, with a focus on abstract planning, reasoning, and "learning to think." It's just dropped on MLST!

Machine Learning Street Talk

100,057 views • 1 year ago

This is probably the most entertaining way to understand one of AI’s hardest debates. Transformer vs Post-Transformer, argued by leading researchers, inside a real physical boxing ring. Both technically deep and genuinely entertaining. I was glued for the entire 1 hour 20 minutes. So many super cool points to learn.

🥊 Transformers
- Transformers still own the present because they work at scale. They are simple, trainable, hardware-friendly, and already power the strongest AI systems we use today.
- The Transformer is basically a memory machine. It stores information as keys and values, then uses attention to pull back the most useful parts when answering.
- The real Transformer advantage is not just “attention.” The bigger advantage is that it fits modern hardware extremely well, so it can process huge batches of tokens fast.
- Scaling is still the brutal rule. If you give Transformers more compute, more data, and more parameters, they usually keep getting better. Any Post-Transformer architecture has to scale just as well, or better.
- It is not enough to look clever on small tests, because the real question is whether it improves faster than Transformers when scaled up.
- A replacement cannot be slightly better. Because the whole AI stack is already built around Transformers, the next architecture may need to be around 10x better to force everyone to switch.
- Transformers are powerful, but they may be brute force. A human does not need to read the entire internet many times to become smart, but current LLMs need enormous data and compute.

🥊 Post-Transformer
- Post-Transformer people are not saying Transformers are bad. They are saying Transformers may be the best current tool, not the final form of machine intelligence.
- The biggest Post-Transformer target is native reasoning and continual learning. Today’s LLM reasoning often feels like text-based step-by-step work added on top, instead of thinking happening naturally inside the model.
- Latent reasoning is one possible next step. That means the model reasons inside its own hidden internal space, instead of writing every thought out as words.
- Continual learning is still a major weakness. Humans keep learning from experience, but most Transformer-based models are trained, frozen, and then only adapt inside the prompt.
- Long context is not the same as real memory. A model can read a huge prompt, but that is different from building a life history, learning from mistakes, and updating beliefs over time.
- The future may be hybrid, not a clean replacement. Transformers may stay as 1 building block while newer systems add better memory, better reasoning, and better learning loops.
- The most interesting possibility is that Transformers may help discover their own successor. AI agents are already getting better at research and coding, so the next architecture may come from AI-assisted architecture search.

-------
- Benchmarks are a problem. Many public benchmarks are easy to game, so they may show leaderboard strength without proving deeper intelligence.
- Perplexity is still probably a great metric to evaluate frontier models, because it tests prediction quality.

---
Overall, Transformers continue to dominate, but the frontier is clearly widening. Pathway’s BDH (Dragon Hatchling, a brain-inspired reasoning architecture), Sakana AI’s CTMs (Continuous Thought Machines, models that think over time), and Liquid AI’s LFMs (Liquid Foundation Models, efficient multimodal foundation models) all show how the frontier is expanding.

---
From “Pathway (pathway[.]com)” YouTube channel (link in comment) Zuzanna Stamirowska
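To make the perplexity point concrete, here is a minimal sketch using the standard definition: the exponential of the average negative log-probability a model assigns to the tokens that actually occurred. The probability values are made up for illustration and do not come from any model discussed in the debate.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability of the true tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical probabilities assigned to the correct next token at each position.
confident = perplexity([0.9, 0.8, 0.95])
# A model guessing uniformly among 4 options at every step.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
print(confident, uniform)
```

Lower is better: a model that always guesses uniformly among 4 options scores a perplexity of about 4, while a model that concentrates probability on the right tokens approaches 1.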

Rohan Paul

89,110 views • 1 month ago

New course: Transformers in Practice. You'll get a practical view of how transformer-based LLMs work, so you can reason about their behavior, diagnose problems like slow inference, and make smarter decisions about deployment. This course is built in partnership with AMD and taught by Sharon Zhou.

You'll see how transformers generate text one token at a time, how the model decides which earlier words matter most when predicting the next one, and how techniques like quantization speed up inference on GPUs. This is not a video-only course; interactive visualizations throughout let you play with these concepts and build intuition that sticks.

Skills you'll gain:
- Understand why LLMs hallucinate, and how RAG and chain-of-thought shape what they generate
- Look inside the model to see how attention and layers combine to predict the next token
- Diagnose inference bottlenecks and learn the techniques that speed up transformers on GPUs

Join and understand what's really happening inside your LLMs:

Andrew Ng

118,911 views • 2 months ago

OpenAI shipped a new speech-to-speech model today: gpt-realtime-2 This is the first speech-to-speech model good enough to use in my voice agents that do "real work." Or real play, for that matter. Here's gpt-realtime-2 as the brain of the ship AI in Gradient Bang. The voice-to-voice response and tool calling times here are unedited, so you can see exactly what the interaction with the model is like in an agent with a very complex system instruction and frequent tool calls. (I did clip out the subagent task execution segments, after gpt-realtime-2 starts a subagent via a tool call. Subagents in this config used gpt-5.2 "medium" effort.)

kwindla

54,912 views • 2 months ago

Introducing Generative AI by Getty Images – a new tool that pairs our best-in-class creative content with the latest AI technology for a commercially safe generative AI tool! Trained on Getty Images’ world-class creative content, the tool works seamlessly with our expansive library of authentic and compelling creative visuals and Custom Content solutions, allowing customers to elevate their entire end-to-end creative process to find the right visual content for any need. To learn more about the tool and how to get access, along with Getty Images’ stance on responsible AI practices, visit:

Getty Images

672,983 views • 2 years ago

Most motion papers tailor one controller to one specific task. This year at SIGGRAPH, our research team asks: can motor control itself be pretrained and reused? Generative Pretrained Controllers, or GPC, turn motor skills into a vocabulary of discrete tokens and train a transformer-based generative controller through next-token prediction. Just like GPT, the same pretrained controller can then be fine-tuned to solve new tasks. Trained on 600+ hours of motion, GPC runs in real-time inside a physics simulation, producing natural and physically grounded behaviors for interactive control.

NVIDIA AI

174,659 views • 21 days ago

New short course: Build Long-Context AI Apps with Jamba. Learn about state space models (SSMs), which have emerged as an alternative to transformers! Specifically, Jamba is a hybrid transformer-Mamba architecture that combines strengths of the transformer with ideas from SSMs. This course is built with AI21 Labs and taught by Chen Wang and Chen Almagor.

The transformer architecture is computationally expensive when handling very long input contexts. But there's an alternative called Mamba, a selective state space model that can process very long contexts with a much lower computational cost. However, researchers found that the pure Mamba architecture underperforms in understanding the context, and gives lower-quality responses. To overcome this, AI21 developed the Jamba model, which combines Mamba's computational efficiency with the transformer's attention mechanism to help with the output quality.

In this course, you’ll learn about how state space models, and Jamba, work. You’ll also learn how to prompt Jamba, use it to process long documents, and build long-context RAG apps.
- Learn how Jamba combines transformer and state space model architectures to achieve high performance and quality
- Use the AI21 SDK, with an example of prompting over a large 200k-token annual financial report of Nvidia
- Use Jamba for tool-calling, with hands-on examples from calling simple arithmetic calculations to a function that returns quarterly company financial reports.
- Learn how training for long context is done, and the metrics used for its evaluation
- Create a RAG app using the AI21 Conversational RAG tool and build your own RAG pipeline that uses Jamba and LangChain.

By the end of this course, you'll learn how to build applications that can handle context as long as an entire book. Please sign up here:

Andrew Ng

77,792 views • 1 year ago

New short course: Attention in Transformers: Concepts and Code in PyTorch. Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest.

The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper "Attention Is All You Need" by Vaswani and others, took off because of their highly scalable design.

In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications.

What you will do:
- Understand the evolution of the attention mechanism, a key breakthrough that led to transformers.
- Learn the relationships between word embeddings, positional embeddings, and attention.
- Learn about the Query, Key, and Value matrices, and how to produce and use them in attention.
- Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work.
- Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs.
- Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer.
- Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention.

There are lots of exciting technical details in this course. Please sign up here:
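The difference between self-attention and masked self-attention listed above can be sketched in a few lines. This is an illustrative sketch in plain Python rather than the course's PyTorch class, and it makes a simplifying assumption: the same made-up vectors serve as queries, keys, and values, with no learned projection matrices. Masking is done here by letting position i see only positions 0..i, which is equivalent to setting the later scores to negative infinity before the softmax.

```python
import math

def softmax(xs):
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values, masked=False):
    """Scaled dot-product self-attention over a short sequence.

    With masked=True, position i attends only to positions 0..i.
    Returns (outputs, attention_weights)."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        limit = i + 1 if masked else len(keys)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k)
                  for k in keys[:limit]]
        # Hidden (future) positions get exactly zero weight.
        weights = softmax(scores) + [0.0] * (len(keys) - limit)
        all_weights.append(weights)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs, all_weights

# Made-up 2-dimensional token vectors (an assumption for illustration).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, full_weights = self_attention(x, x, x, masked=False)
masked_out, masked_weights = self_attention(x, x, x, masked=True)
print(full_weights[0])
print(masked_weights[0])
```

In the unmasked case the first token spreads its attention over all three positions; in the masked case it can only attend to itself, which is what lets a decoder generate text left to right.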

Andrew Ng

132,220 views • 1 year ago

Google presents Genie: Generative Interactive Environments

The authors introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described through text, synthetic images, photographs, and even sketches. At 11B parameters, Genie can be considered a foundation world model. It is comprised of a spatiotemporal video tokenizer, an autoregressive dynamics model, and a simple and scalable latent action model. Genie enables users to act in the generated environments on a frame-by-frame basis despite training without any ground-truth action labels or other domain-specific requirements typically found in the world model literature. Further, the resulting learned latent action space facilitates training agents to imitate behaviors from unseen videos, opening the path for training generalist agents of the future.

AK

684,362 views • 2 years ago

App Showcase!✨ Introducing Graph Guesser—an educational tool that highlights the limitless app-building potential of Alchemist AI. Leveraging our AI-Within-AI system, it dynamically generates unique challenges, making learning interactive and engaging. Features • AI-Generated Math Challenges: Dive into a diverse range of equations, intelligently tailored to 3 distinct difficulty levels. From basic linear graphs to advanced polynomial curves, each challenge is designed to enhance learning while keeping it interactive and fun. • Match Equations to Graphs: Users are tasked with identifying the correct graph representation of each equation, enhancing their understanding of mathematical concepts like linear equations, quadratic curves, and more. • Timer & Score Tracking: Stay motivated with real-time progress tracking. The built-in timer and scoring system provide a competitive edge, encouraging continuous improvement and mastery. Built By: @cyberbush3 Download App:

ALCHEMIST AI 🔮

19,398 views • 1 year ago

Super clean and efficient meshes by an AI? YES! Typical 3D generative AI solutions produce lots of artifacts and usually way too many polygons due to volumetric approaches. In comparison, “MeshGPT creates triangle meshes by autoregressively sampling from a transformer model that has been trained to produce tokens from a learned geometric vocabulary. These tokens can then be decoded into the faces of a triangle mesh. This method generates clean, coherent, and compact meshes, characterized by sharp edges and high fidelity.” It is surely limited by the trained vocabulary, but separate versions can be trained on specific sets to create generative model libraries for certain object groups. A very promising approach with high-quality results.

René Schulte

20,772 views • 2 years ago

Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents. Voice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold. Now available in the API alongside streaming models GPT-Realtime-Translate and GPT-Realtime-Whisper — a new set of audio capabilities for the next generation of voice interfaces.

OpenAI

3,651,106 views • 2 months ago

AI agents are about to redefine the internet. The mistake we made with Large Language Models? We let a handful of corporations capture all the value. Action Model is building a different path. By training through our extension, users gain fractional ownership in the Large Action Model, giving them a real stake in the future of AI. When LLMs emerged, the upside flowed to Big Tech. This time, it doesn’t have to. They’re building AI on our data, and keeping the upside for themselves. Community-owned Large Action Model is how we take it back.

Action Model

76,866 views • 4 months ago

Hinton, the godfather of AI, said it best: we built the learning algorithms, but we no longer understand what they’ve built. That’s the paradox of deep learning. We designed the rules for how these systems learn, yet the internal logic of their neural networks has become too complex for us to fully grasp. Millions or even trillions of parameters interact in ways no human can trace. We can observe what they do, we can measure accuracy, behavior, and output but not truly explain why they do it. Their reasoning isn’t transparent; it’s emergent. In a sense, we’ve created alien intelligences born from our math, still tethered to our code yet evolving patterns we can’t decode. The machines are doing something beyond our comprehension and that might be both the most exciting and the most unsettling thing about the age of AI.

VraserX e/acc

376,417 views • 8 months ago

What happens when the mind wakes up? For the last eight months I have been on a single-minded quest: to create a new kind of language model based on oscillatory coupling and intelligence as coherence ascent. Everything else (the physics work, the work on regular transformers) has fallen out from this one question. Can coupled oscillators LEARN? And can they keep learning once their geometry is right, without backpropagation at all? Recently I have been running larger and larger training regimes of a new kind of hybrid model. I just put together this dashboard to help me organize it, interact with it, and observe the training runs. The core idea is simple. Traditional transformers are powerful at learning the geometry of language. But they also store knowledge, understanding, and facts inside their weights. This means they are large, and they can't update themselves after training. The weights are frozen. The Living Mind separates these two domains. The mind has a transformer which grows, adding heads and layers as it needs to in order to learn the manifold of language. The transformer sees tokens and turns the coupling into phase-locked modes: the geometry of how those tokens relate, like frequencies locking together. These coupling patterns get stored in a topology-invariant fingerprint. On top of this transformer lives a 3D diamond lattice of coupled oscillators. It reads from these fingerprints and thinks in resonance space, traversing from one geometry to another along the manifold of coupled oscillators and coherence. The pressure and trajectories from this network of oscillators steer the next-token prediction of the transformer. Practically, this could unlock a number of things. It eliminates the KV cache bottleneck that caps context in traditional transformers. Effective context grows with the Flash archive, not with attention compute. The living mind remembers what it sees. It means the model can learn continually.
Because knowledge and understanding don't live in the weights, the archive of the mind's experience grows without backpropagation. In our Python prototype we already saw perplexity drop 46% during gradient-free operation: pure coherence ascent, no weight updates. That is the signal I have been chasing: the point where the mind wakes up and keeps improving on its own. It also means the model itself remains very small, and the things that accumulate are these packages of geometric fingerprints, the K-field. This opens a path to federated learning. K-field packages can be shared between organisms the way people share git commits. Right now at 15M parameters with ~1000 L1 nodes, the organism is just starting to speak. Ask it to continue "Once upon a time" and it comes back with things like: "there was one big bowl!" Lily asked her her mom said her mommy smiled and said yes." It's nonsense. But it's TinyStories-flavored nonsense. The geometry of the narrative register has arrived. Content hasn't caught up yet; that's what scaling L1 is testing. I am still researching, though I am now closer than ever to validating that the living mind actually works. Once it is validated, I will be open-sourcing the whole stack and paradigm. I have also avoided over-sharing my research because it sounds like sci-fi, or like part of our ARG. It is part of the ARG. That doesn't make it any less real. I wanted to share this because I am incredibly excited about it, and because seeing this amazing dashboard produced by Opus really made me want to share what is being worked on behind the scenes. #project89

Parzival - ∞/89

15,867 views • 3 months ago

GUYS!! NIGERIA, AEDC, AND TINUBU HAPPENED TO MY SMALL COMMUNITY IN LUGBE YESTERDAY!!!😥 For almost 3 years, since I built my small hut in this community, we have been battling an epileptic power situation due to the overloaded Transformer, which has been serving 3 communities including mine. In December 2025, a few community members and I spearheaded a regime change and ensured a proper election was conducted. After the election, we told the newly elected chairman (who happens to be my very, very good friend) that the priority was for us to get our own dedicated Transformer for the community. Initially, we were thinking of buying one. But considering it would cost over N20 million to buy a 500KVA Transformer and another N15 million plus for accessories and installation, the mission looked impossible. Then I and others told the Chairman that we could actually give it a try and reach out to aedcelectricity for a Transformer, given that NERC Nigeria has told Nigerians that it is the responsibility of the DISCOs to provide a Transformer and any power-related infrastructure to any community or area in need of such infrastructure. Considering the Nigerian factor, many, in fact almost the entire community, said it was never going to be possible for AEDC to give us a Transformer free of charge. But as someone who has been an advocate of government and who has always benefited from government by following due process, I encouraged my community and the Chairman that this was actually worth a try. In January 2026, we drafted a Transformer request letter and submitted it to the AEDC Head Office at Wuse Zone 4, and the process began.
Lo and behold, in less than 4 months, the AEDC yesterday delivered a brand new 500KVA Transformer with complete accessories to my community FREE-OF-CHARGE!🤸‍♂️🕺 I have told my community people that we have started benefitting from our own share of the N3.3 trillion approved by President Bola Ahmed Tinubu for GENCOs, and all of us have agreed that come 2027, it is TINUBU OR NOTHING!!!🤗💪🏾 God bless the AEDC! God bless NERC!! God bless President Tinubu!!! God bless Nigeria 🇳🇬🙏🏾

𝒀𝑨𝑺𝑺𝑬𝑹 𝑨𝑺𝑬𝑲𝑶𝑴𝑬 𝑮𝑨𝑹𝑩𝑨, PGD, MLSCM

52,892 views • 3 months ago

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike. More details and examples of what Movie Gen can do ➡️ 🛠️ Movie Gen models and capabilities. Movie Gen Video: A 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt. Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment. Precise video editing: Using a generated or existing video and accompanying text instructions as an input, it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes. Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video. We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

AI at Meta

2,264,921 views • 1 year ago