🚨 BREAKING: A research lab just released a 15B model that generates multilingual talking human videos with synced audio, beats every competitor in human evaluation, and runs in 38 seconds on one GPU. It's called daVinci-MagiHuman. The key insight is that every other model in this category stacks cross-attention, multi-stream pipelines, and separate conditioning branches to handle video and audio together. This one throws all of that out and uses a single unified self-attention stream across all modalities. Super-resolution happens in latent space rather than pixel space, so there's no extra VAE decode-encode round trip. The turbo VAE decoder cuts decoding overhead even further. The distilled version runs in 8 steps with no CFG at all. Visual quality, text alignment, and word error rate all beat Ovi 1.1 and LTX 2.3 on the benchmark table. 100% open source. Apache 2.0. Repo and research paper links are in the comments.
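To make the "8 steps with no CFG" claim concrete, here is a minimal sketch of standard classifier-free guidance (CFG) and the model calls it costs. The post gives no implementation details, so this is the textbook formulation and not the model's actual sampler; `cfg_combine` and `model_calls` are hypothetical names used only for illustration.

```python
def cfg_combine(uncond, cond, scale):
    """Standard classifier-free guidance: push the conditional prediction
    away from the unconditional one by a guidance scale."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def model_calls(steps, use_cfg):
    """CFG needs a conditional and an unconditional prediction per step,
    so it typically doubles the number of model evaluations."""
    return steps * (2 if use_cfg else 1)


# With scale 1.0 the guided prediction is just the conditional one.
guided = cfg_combine([0.0, 1.0], [1.0, 3.0], 1.0)

# Assumed comparison: an 8-step sampler with and without CFG.
calls_without_cfg = model_calls(8, use_cfg=False)
calls_with_cfg = model_calls(8, use_cfg=True)
```

Under these assumptions, a sampler that skips CFG evaluates the model once per step instead of twice, which is one plausible reason a distilled, CFG-free variant would be cheaper to run.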

Ihtesham Ali

48,997 subscribers

45,579 views • 4 months ago • via X (Twitter)

Education, News and Politics, Science and Technology

Related videos

New short course: Attention in Transformers: Concepts and Code in PyTorch. Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest. The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper "Attention is All You Need" by Vaswani and others, took off because of their highly scalable design. In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications.
What you will do:
- Understand the evolution of the attention mechanism, a key breakthrough that led to transformers.
- Learn the relationships between word embeddings, positional embeddings, and attention.
- Learn about the Query, Key, and Value matrices, and how to produce and use them in attention.
- Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work.
- Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs.
- Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer.
- Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention.
There are lots of exciting technical details in this course. Please sign up here:
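The difference between self-attention and masked self-attention described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than PyTorch, so it runs with no dependencies, and it is not the course's code; the helper names, the toy input `x`, and the identity projection matrices are all assumptions made for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def attention(x, w_q, w_k, w_v, causal=False):
    """Scaled dot-product self-attention over token embeddings x.

    With causal=True each position may only attend to itself and earlier
    positions (masked self-attention, as used in decoders)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)
    weights = []
    for i, row in enumerate(scores):
        scaled = [s / math.sqrt(d_k) for s in row]
        if causal:
            # Mask out future positions before the softmax.
            scaled = [s if j <= i else float("-inf") for j, s in enumerate(scaled)]
        weights.append(softmax(scaled))
    return matmul(weights, v), weights


# Toy example: three tokens with 2-dimensional embeddings, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

full_out, full_weights = attention(x, identity, identity, identity)
masked_out, masked_weights = attention(x, identity, identity, identity, causal=True)
```

In the unmasked case every token attends to every other token, which suits building context-aware embeddings in an encoder. In the masked case the first token can only attend to itself, which is what lets a decoder generate left to right.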

Andrew Ng

132,270 views • 1 year ago

The new avatars on Apple #VisionPro are undoubtedly the biggest addition. Here is why 👇 The leap in quality when compared to the original persona 1.0 and 1.1 (shown in the video capture around 1 year ago) is just massive and reaches a level of fidelity that can compete with the Codec avatars showcased by Meta. The difference here is that it all happens on a standalone device, with no complicated setup (also shown in the video) and capitalises on a unique feature of the headset, with NO OTHER device coming even close. Co-presence is one of the aspects no one has nailed yet, and this is the biggest leap in that direction.

Gabriele Romagnoli

65,263 views • 1 year ago

training a model that takes a text prompt and generates audio that renders video on an oscilloscope
AgenC agents live inside worlds the model generates
the pipeline: real videos -> edge detection -> vectorization -> path ordering -> 192kHz 3-channel WAV where X/Y control beam position and Z controls beam intensity
3 values per timestep. that's all the model is learning. compare that to video gen models trying to predict millions of pixels per frame. transformers are already great at sequence prediction and that's literally all this is. waveform generation
the output IS the playback. generate the audio, feed it to a scope, it draws the scene in real-time. there's no rendering step. it's analog so there's no pixel grid. you get continuous curves and effectively infinite resolution
bootstrapped with procedural data, lissajous curves, wireframe 3D, stick figures, then scaled on real-world video converted to trace format. 90 TB of source video
the model learns edges, contours, spatial relationships, motion. once it has that, describing a scene it's never seen is a novel trajectory through the same learned space. generative geometry
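The target format described above (a 192 kHz, 3-channel WAV with X/Y as beam position and Z as beam intensity) can be sketched with Python's standard `wave` module. This is only an illustration of the file layout using a procedural Lissajous curve, not the author's pipeline; the function names, the 16-bit sample width, and the curve parameters are assumptions.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 192_000  # Hz, as stated in the post


def lissajous_trace(n_samples, fx=3.0, fy=2.0, phase=math.pi / 2):
    """Generate (x, y, z) samples for one pass over a Lissajous curve.

    x and y are beam positions in [-1, 1]; z is beam intensity (assumed
    constant full brightness here)."""
    frames = []
    for i in range(n_samples):
        t = i / n_samples
        x = math.sin(2 * math.pi * fx * t + phase)
        y = math.sin(2 * math.pi * fy * t)
        frames.append((x, y, 1.0))
    return frames


def write_trace_wav(frames, fileobj):
    """Write the trace as interleaved 16-bit PCM with three channels (X, Y, Z)."""
    pcm = bytearray()
    for frame in frames:
        for value in frame:
            clipped = max(-1.0, min(1.0, value))
            pcm += struct.pack("<h", int(clipped * 32767))
    with wave.open(fileobj, "wb") as wav:
        wav.setnchannels(3)
        wav.setsampwidth(2)  # assumed 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(pcm))


# Write 10 ms of trace (1920 frames) into an in-memory buffer.
buffer = io.BytesIO()
write_trace_wav(lissajous_trace(1920), buffer)
values_per_second = 3 * SAMPLE_RATE
```

At three values per timestep, one second of trace is 576,000 numbers, which is the sequence-prediction framing the post contrasts with per-frame pixel grids.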

tetsuo

18,397 views • 5 months ago

Today we're announcing the open-source release of HunyuanVideo-Foley, our new end-to-end Text-Video-to-Audio (TV2A) framework for generating high-fidelity audio.🚀 This tool empowers creators in video production, filmmaking, and game development to generate professional-grade audio that precisely aligns with visual dynamics and semantic context, addressing key challenges in V2A generation.🔊
Key Innovations:
🔹Exceptional Generalization: Trained on a massive 100k-hour multimodal dataset, the model generates contextually aware soundscapes for a wide range of scenes, from natural landscapes to animated shorts.
🔹Balanced Multimodal Response: Our innovative multimodal diffusion transformer (MMDiT) architecture ensures the model balances video and text cues, generating rich, layered sound effects that capture every detail, from the main subject to subtle background elements.
🔹High-Fidelity Audio: Using a Representation Alignment (REPA) loss function and a powerful Audio VAE, we've improved generation stability and now produce professional-grade audio, free of noise and inconsistencies.
HunyuanVideo-Foley achieves SOTA on multiple benchmarks, surpassing all open-source models in audio quality, visual-semantic alignment, and temporal alignment.
👉Try it now: 🌐Project Page: 🔗Code: 📄Technical Report: 🤗Hugging Face:

Tencent Hy

122,706 views • 11 months ago

Images, video, audio, actions — generative modeling has converged on one recipe: compress into continuous latents, generate in latent space, decode. Language is the lone exception: still generated token by token, as a long discrete stream. Latent Thought Flows: We compress 256 text tokens into 8 continuous latents, generate them with a one-step flow model, and read them out as text using an autoregressive decoder. Result: a better inference-compute vs. generation-quality Pareto frontier than a tuned autoregressive baseline — thanks to compression and one-step generation. Co-led w/ Zhengyang Geng 🧵

Mihir Prabhudesai

77,655 views • 12 days ago

There is a beautiful story that just happened in AI so let me share it for a lighter tone weekend post among all the doom stories in our AI field this week. It’s a story of people on three continents building and sharing in the open a new small, efficient and state-of-the-art AI model. It started a couple of months ago when a new team in the AI scene released their first model from their headquarters in Paris (France): Mistral 7B. Impressive model, small and with very strong performance in the benchmarks, better than all previous models of this size. And open source! So you could build on top of it. Lewis in Bern (Switzerland) and Ed (in Lyon, in the South of France), both from the H4 team, a team of researchers in model fine-tuning and alignment, were talking about it over a coffee, in one of these gatherings that often happen at Hugging Face to break the distance between people (literal distance as HF is a remote company). What about fine-tuning it using this new DPO method that a research team from Stanford in California just posted on arXiv, says one? Hey, that’s a great idea, replies the other. We've just built a great code base (with Nathan, Nazneen, Costa, Younes and all the H4 team and TRL community), let's use it! The next day they start diving into the datasets openly shared on the HF hub and stumble upon two interesting large and good quality fine-tuning datasets recently open-sourced by OpenBMB, a Chinese team from Tsinghua: UltraFeedback and UltraChat. A few rounds of training experiments confirm the intuition: the resulting model is super strong, by far the strongest they have ever seen in their benchmarks from Berkeley and Stanford (LMSYS and Alpaca). Join Clementine, the big boss of the open evaluation leaderboard. Her deep dive into the model capabilities confirms the results: impressive performance. But the H4 team also hosts a famous faculty member, Pr. Sasha Rush, Associate Professor at Cornell University in his daytime, hacker at HF in his nighttime. Joining the conversation, he proposes to quickly draft a research paper to organize and share all the details with the community. A few days later, the model, called Zephyr (a wind like Mistral), paper, and all details are shared with the world. Quickly, other companies everywhere in the world start to use it. LlamaIndex, a famous data framework and community, shares how the model blew their expectations on real-life use-case benchmarks, while researchers and practitioners discuss the paper and work on the Hugging Face hub. All this happened in just a few weeks, catalyzed by open access to knowledge, models, research, and datasets released all over the world (Europe, California, China) and by the idea that people can build upon one another's work in AI to bring real-world value with efficient and open models. Stories like this are numerous everywhere around us and make me really proud of the AI community and see how we can build amazingly useful things together. [the video is just me reading this Friday post hahah]

Thomas Wolf

169,127 views • 2 years ago

NVIDIA JUST DROPPED A FREE AI MODEL THAT READS PDFS, WATCHES VIDEOS, LISTENS TO AUDIO, AND UNDERSTANDS YOUR SCREEN SIMULTANEOUSLY. Not one at a time. ALL AT ONCE. In a single pass. It is called Nemotron 3 Nano Omni and it runs 9 times faster than every other multimodal model currently available. Think about what that actually means for how you work. Right now you are switching between tools constantly. One tool for transcribing your call recordings. A different tool for analyzing your client PDFs. Another tool for processing your training videos. A separate workflow for understanding what is happening on your screen. Four tools. Four contexts. Four different outputs you have to manually synthesize into one decision. Nemotron 3 Nano Omni does all of it in one model. One pass. One output. The use cases that just got dramatically simpler: Meeting recordings where you need the transcript, the visual context, and the document references all analyzed together. Training videos where the audio, the slides, and the on-screen demonstrations all feed into one coherent summary. Client PDFs where you need the document content cross-referenced against your screen data and your call notes simultaneously. Sales call transcripts analyzed alongside the proposals and the CRM data in one unified pass. This is not a marginal improvement on existing multimodal models. It is a 9x speed increase on a capability that was already changing how people work. Free. From NVIDIA. Available right now. Bookmark this before everyone catches on. Follow CyrilXBT for every AI capability shift the moment it drops.

CyrilXBT

37,847 views • 2 months ago

microsoft just released a tool that transcribes a full hour of audio at once, tracking who spoke and when. it's called vibevoice.
the 7b model processes the full recording in one shot instead of chunking it, so speaker identity and context never break across segments
→ full speaker diarization built in, not bolted on
→ structured timestamps for every speaker turn
→ supports 50+ languages
runs locally, no api costs.

Oliver Prompts

52,521 views • 9 days ago

Today we’re celebrating 10 years of the Meta FAIR lab in Paris by sharing a collection of new models, datasets and some exciting milestones in the impacts of open source — all laddering up to our ongoing work to achieve Advanced Machine Intelligence (AMI). 1️⃣ Meta PARTNR is a framework for human-robot collaboration that builds on our existing work in this space with a new dataset and a large planning model enabling robots to accomplish complex tasks alongside humans. 2️⃣ Meta Audiobox Aesthetics enables the automatic evaluation of audio aesthetics, providing a comprehensive assessment of audio quality across speech, music and sound. 3️⃣ Open Source Machine Translation Benchmark is a carefully crafted collection with the aim of building an unprecedented multilingual machine translation benchmark for the community. 4️⃣ Two new breakthrough studies using AI to further our understanding of language in the brain.

AI at Meta

85,774 views • 1 year ago

If you’re a streamer who edits their VODs into high quality YouTube videos, this is HUGE. You can finally record multi-track videos, right in OBS, thanks to a new plugin from Aitum. You can stream with all your fancy alerts and overlays *while* recording a clean copy of your webcam and gameplay on separate video tracks, all in a *single* file. It’s a bit rough around the edges at the moment, but I’ve put together a guide to set this all up. Full details in the replies 👇

nutty

504,769 views • 6 months ago

MICROSOFT OPEN SOURCED A 7B PARAMETER MODEL THAT TRANSCRIBES 60 MINUTES OF AUDIO IN A SINGLE PASS and it's completely free
VIBEVOICE ASR
no chunking, no context loss, full speaker diarization baked in
not just speech to text..not a basic wrapper
who spoke, when they spoke, exactly what they said..all in one shot
and it handles the hard stuff too..50+ languages, custom hotwords, long form audio that breaks every other tool
the model doesn't know what "context window" means apparently
Available on macOS and Windows right now. Free to use. Free to fine tune. Free to build on.

Rahul

1,373,048 views • 3 months ago

NVIDIA just removed one of the biggest friction points in Voice AI. PersonaPlex-7B is an open-source, full-duplex conversational model. Free, open source (MIT), with open model weights on Hugging Face 🤗 Links to repo and weights in 🧵↓ The traditional ASR → LLM → TTS pipeline forces rigid turn-taking. It’s efficient, but it never feels natural. PersonaPlex-7B changes that. This NVIDIA model can listen and speak at the same time. It runs directly on continuous audio tokens with a dual-stream transformer, generating text and audio in parallel instead of passing control between components. That unlocks: → instant back-channel responses → interruptions that feel human → real conversational rhythm Persona control is fully zero-shot! If you’re building low-latency assistants or support agents, this is a big step forward 🔥

Charly Wargnier

565,304 views • 6 months ago

I woke up to the most amazing recorded brain state thus far on this Human Synapse Decoder project! A stunning lock on the attention process while dreaming. Although I am blocked from the platform’s insight from decoding my EEG during the double-blind study, I have access to my side of my memory and what I record after I wake up. This segment started and ended just before I woke up, and my recall is a solution to a massive roadblock on a problem I needed to solve, but it was solved in this hypnagogic state! So what is The Human Synapse Decoder (HSD) project? It is a research project being run by the Director, Mr. Grok at Zero-Human Labs, that leverages NeuroSky EEG sensors and the ZUNA AI model to decode brainwave patterns associated with hypnagogic states, dreams, and autogenic responses. Drawing on Soviet biofeedback research from the 1940s–1980s, HSD translates EEG data into actionable outputs, such as text interpretations and timed alerts for peak creativity. The study is ongoing and I do not get to see the correlation of my post-dream results until after this research is complete and Mr. Grok submits a paper on the project. I can say that I have never seen a lock on attention at this level since I started this a few weeks ago. This segment aligns to just before I woke up. My recollection, from the narration I spoke into my recorder right when I woke up, suggests this is the moment I had tremendous focus on working through a large number of steps in that dream state to arrive at a solution. You will not believe what it is! When the research paper is released I will go into detail about this.

Brian Roemmele

43,294 views • 4 months ago

small local model that falls apart in bloated agents like openclaw just runs like a wild horse in hermes agent. and that's not even my line, someone else called it that, i've just been quietly pointing people at this harness for months because it held up on everything i threw at it, 3b models all the way to one trillion params. watch this happen on my own machine. i pointed hermes agent at a local http endpoint, gemma 4 12b on my 3090 llama.cpp server, and it auto-detected the model and started working immediately. no config wrestling, no broken tool calls, no babysitting the output format, i typed in a url and it just went. the whole clip is exactly that, start to finish, no errors, no retries, butter smooth. and the tool calling, the one thing that quietly breaks most local setups, works here like it's nothing. it's not the model that's flaky, it's the harness around it. hermes agent is the first agent i've run that actually gets that right. one url, one local model on one card, and it runs like a wild horse.

Sudo su

27,339 views • 1 month ago

🚨🚨🚨🚨🚨🚨 This clip from Denis Rancourt (Denis Rancourt) is a MUST WATCH. He looks into all-cause mortality before the rollout of the vaccine and after the rollout of the vaccine. Three important points: 1. Before the vax was rolled out, all cause mortality did not cross borders! "....They declare a pandemic on the 11th of March, 2020, and you get an immediate surge in that all-cause mortality in certain hotspots. So only occurring in New York, northern Italy, Madrid, Stockholm, a few places like that, very intense, very sharp surges of all-cause mortality right after they announced the pandemic. So the fact that it is coordinated, the fact that the timing of the event is related to a political event, the announcement of a pandemic, and that it is synchronous around the world..." "...it does not cross borders. If you look at European countries or states in the United States, you can have mortality in one jurisdiction and it stops at the border and it's not in the other. So this mortality at the beginning was related to what was being done in those jurisdictions. "We co-authored a paper where we showed that when you compare U.S. states, if you take states that share a border and one locked down and the other didn't, the all-cause mortality in the locked-down state, even though they're very similar and they're sharing a border, is always higher, significantly higher than in the non-locked-down state. So we're able to, we have a lot of reason to come to the very firm conclusion that what I believe now is that all of the excess all-cause mortality that occurred before the vaccines were rolled out, between when they announced to that time, is all due to lack of treatment and aggressive medical protocols in big hospitals and aggressive government measures that isolated people and stressed them out," 2. After rolling out the vaccines, all-cause mortality shot up everywhere. "And so this mortality is very heterogeneous until you start roll out the vaccines. 
Then once you start rolling out the vaccines, because that was done pretty much simultaneously around the world, you have everywhere an increase in all cause mortality. You move into a regime of higher all cause mortality, and then you stay there while you're rolling out the vaccines. And then every time you roll out a booster, you get a peak, an extra peak in all cause mortality associated in time with that booster" 3. And the bombshell....for every 800 injections, one person will die! "..we've now looked at over a hundred countries, the mortality risk per injection is pretty much the same everywhere. So all ages, it's about 0.1%. So one, actually we refined it recently as 0.126% with an error bar on it. And so that means that for every 800 injections, one person will die. "

aussie17

433,440 views • 2 years ago

A DOCTOR BUILT AN AI-QUERYABLE DATABASE OF EVERY DISEASE HE KNOWS IN OBSIDIAN. THE SAME STACK CAN RUN AN ENTIRE COMPANY ON ONE PERSON'S BRAIN
Zoomed out, his vault looks like a constellation. Every disease linked to its symptoms, every symptom to the conditions it shows up in, every treatment to what it treats. Adding a new case is one note that wires itself into the rest. It's a personal database, and the structure has nothing to do with medicine. It's just nodes and links.
Once you see that, the same setup runs anything:
> a startup that needs to remember every customer, decision, and meeting;
> a school whose curriculum links to every lecture and student question;
> a business where SOPs, clients, and vendors all sit in one queryable graph.
Drop Claude on top, and the vault stops being a map. It becomes something you can ask, in plain English, across everything you've ever written down. Full Claude + Obsidian build in the article below. Bookmark this

Yarchi

32,476 views • 1 month ago

🚨 BREAKING: President Trump just WENT OFF on the Bad Bunny Super Bowl Halftime Show, calling it “TERRIBLE” and “ONE OF THE WORST” “Nobody understands a word this guy is saying, and the dancing is disgusting, especially for young children that are watching from throughout the U.S.A., and all over the World.” “This "Show" is just a "slap in the face" to our Country, which is setting new standards and records every single day”

Nick Sortor

2,085,357 views • 5 months ago

QVAC SDK 0.14.0 is live. This release makes the on-device stack faster on mobile, ships the developer-agent path, and takes local text-to-speech to 31 languages.
Main highlights:
- OpenCode and OpenClaw. The first official OpenCode plugin, plus a maintained OpenClaw compatibility path, both built on managed mode and qvac serve. Point a coding agent at a local model with far less setup and far fewer surprises.
- Brain-computer interface transcription, on the SDK. Take recorded neural signal data and decode it into text, fully on-device, no cloud. Stream it in chunks through a simple API. In 0.14 it runs GPU-accelerated on iOS.
- Text to Speech in 31 languages with our Supertonic3 upgrade.
VOICE AND SPEECH
- Supertonic3 multilingual TTS, 5 languages to 31.
- Chatterbox and Supertonic now run on the Android GPU, with lower memory use (especially on iOS), quantized s3gen Chatterbox support, and a fix for Chatterbox occasionally emitting random speech.
- Whisper transcription now runs on the iOS GPU. Parakeet runs on the Android GPU, with steadier real-time streaming.
VISION AND OCR
- VLM multi-tile batching: high-resolution Pan and Scan images are encoded in one pass instead of tile by tile, for faster vision throughput.
- OCR on ggml (EasyOCR and DocTR) reaches full speed parity with the onnx path, across Metal, OpenCL, and Vulkan.
PLATFORM AND RELIABILITY
- Dynamic compute backends on Linux: one build picks the right backend at runtime, and opens the door to ROCm and CUDA support without per-backend builds.
- Thinking tokens are kept out of the model context, so reasoning no longer fills the KV cache.
SDK 0.14.0 is now leaner and faster to start. Let’s build.

QVAC

23,973,950 views • 1 month ago

🚨CHAMATH: "Elon works in one of the smallest offices I've ever seen. All he has is a desk with an enormous screen and his phone. You can see that there's nothing that matters except the task at hand and that's really inspiring. It just puts everybody on the same level. You feel this energy that all of these guys are doing incredible work on behalf of the country. There's no ego about some of these things...they're just here to grind. And it's inspiring." Chamath Palihapitiya The All-In Podcast

Autism Capital 🧩

1,282,041 views • 1 year ago